Netapp OnTap ems logs go to sc4s:fallback #2610

Open
DavidLopez-jr opened this issue Oct 3, 2024 · 9 comments
@DavidLopez-jr

Note: If your issue is not a bug or a feature request, please raise a support ticket through our support portal (Splunk.com > Support > Support Portal). This will help us resolve your issue more efficiently and provide you with better assistance. For more information on how to work with the Splunk Support, please refer to this guide.

Was the issue replicated by support? No

What is the sc4s version? 3.26.1

Which operating system (including its version) are you using for hosting SC4S? REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux" REDHAT_SUPPORT_PRODUCT_VERSION="8.9"

Which runtime (Docker, Podman, Docker Swarm, BYOE, MicroK8s) are you using for SC4S? Podman

Is there a pcap available? If so, would you prefer to attach it to this issue or send it to Splunk support? I do not have a pcap available but can provide tcpdump.

Is the issue related to the environment of the customer or Software related issue? No

Is it related to Data loss, please explain? No

Protocol? none

Hardware specs? none

Last chance index/Fallback index? wireformat:rfc|wireformat:rfc3164|vps|.app.app-fallbackz-lastchance|.app.app-fix-invalid-program-z_bsdconvention|ns_vendor:netapp|ns_product:ontap|.source.s_NETAPP_ONTAP

Do we have all the default indexes created? No

Describe the bug
Our SC4S host is collecting events from a NetApp ONTAP host that sends EMS logs. The events arrive in Splunk with sourcetype=sc4s:fallback.
The SC4S filter for this source appears to be misconfigured: the filter/parser seems to have been written for audit logs rather than EMS logs.

Steps to reproduce the behavior:

  1. Go to 'send netapp ems logs'
  2. Click on 'perform splunk search'
  3. Scroll down to 'view results'
  4. See error
@cwadhwani-splunk cwadhwani-splunk self-assigned this Oct 8, 2024
@cwadhwani-splunk
Collaborator

Hi @DavidLopez-jr
Could you please provide the PCAP file so that we can look at the raw logs and try to reproduce the issue in our environment? Please create a support ticket and attach the pcap file or tcpdump output so that we can get the raw logs and move forward.

@DavidLopez-jr
Author

DavidLopez-jr commented Oct 8, 2024 via email

@cwadhwani-splunk
Collaborator

Hi @DavidLopez-jr
I checked the existing parser for netapp:ontap (ontap:ems). The log you provided looks somewhat different from the sample log we used when writing the parser, so we need a pcap file to confirm the format of the EMS logs coming from NetApp ONTAP.
I searched online for a sample EMS syslog message but could not find one, so a pcap file would be helpful.
We can check and update our parser if required.

Also, looking at the tags that you have provided in the GitHub issue,

wireformat:rfc|wireformat:rfc3164|vps|.app.app-fallbackz-lastchance|.app.app-fix-invalid-program-z_bsdconvention|ns_vendor:netapp|ns_product:ontap|.source.s_NETAPP_ONTAP

it seems that you are using some additional env parameters and local parsers. We will need the env file and the local parsers (the /opt/sc4s/local folder) to investigate the issue you are facing.

Please provide these details (PCAP, env file with sensitive details redacted, local parsers) on the support case that you opened. Support can help you collect them.

@cwadhwani-splunk
Collaborator

We reviewed the current parser for NetApp ONTAP within SC4S. The existing parser is designed for the RFC 3164 syslog format and expects the date and time within the message itself. However, the logs you've shared are in RFC 5424 format, so they do not meet the parser's criteria. As a result, the current parser isn't applicable to this syslog format and sourcetype, and the field structures differ.

To address this, we can provide a postfilter for local usage that can correctly classify the shared logs. Note that this postfilter assumes that NetApp ONTAP logs are sent to a dedicated port, as currently configured in your env file (SC4S_LISTEN_NETAPP_ONTAP_UDP_PORT=5090).

Alternatively, you could check if there's a way to configure NetApp ONTAP to send logs in RFC 3164 format. This would eliminate the need for an additional parser.
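
For readers who want to check which of the two formats their own captured lines use, here is a minimal Python sketch. It is only an illustration and is not how SC4S or its f_is_rfc5424 filter is implemented. It relies on one difference between the two headers: RFC 5424 places a version number directly after the <PRI> field, while RFC 3164 places a "Mmm dd hh:mm:ss" timestamp there. The function name and the RFC 5424 sample line are made up for this example.

```python
import re

# RFC 5424: "<PRI>" is followed directly by a version number and a space.
RFC5424_RE = re.compile(r"^<\d{1,3}>[1-9]\d{0,2} ")
# RFC 3164 (BSD): "<PRI>" is followed directly by a "Mmm dd hh:mm:ss" timestamp.
RFC3164_RE = re.compile(r"^<\d{1,3}>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}")


def guess_wire_format(line):
    """Return 'rfc5424', 'rfc3164', or 'unknown' based on the header only."""
    if RFC5424_RE.match(line):
        return "rfc5424"
    if RFC3164_RE.match(line):
        return "rfc3164"
    return "unknown"


# Hypothetical lines, used only to exercise the function.
bsd_line = "<5>Jan 15 09:26:00 [ontapnode-03:raid.spares.media_scrub.start:notice]: site=\"Local\""
ietf_line = "<5>1 2025-01-15T09:26:00Z ontapnode-03 - - - - example message"

print(guess_wire_format(bsd_line))
print(guess_wire_format(ietf_line))
```

Running this against a few lines from a tcpdump capture should be enough to tell which format the ONTAP node is currently sending.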

Here is the postfilter:

Create the file /opt/sc4s/local/config/app_parsers/app-postfilter-netapp_ontap.conf with the following content:

# Set the Splunk destination metadata for the events selected below.
block parser app-postfilter-netapp_ontap() {
    channel {
        rewrite {
            r_set_splunk_dest_default(
                index("infraops")
                sourcetype('ontap:ems')
                vendor("netapp")
                product("ontap")
                class('ems')
            );
        };
    };
};

# Run as a postfilter on RFC 5424 events whose
# .netsource.sc4s_vendor value is "netapp".
application app-postfilter-netapp_ontap[sc4s-postfilter] {
    filter {
        match("netapp", value('.netsource.sc4s_vendor'), type(string))
        and filter(f_is_rfc5424);
    };
    parser { app-postfilter-netapp_ontap(); };
};

Please make sure to restart your sc4s service after making these changes.

@cwadhwani-splunk
Collaborator

As the parser has been provided, we are closing this GitHub issue. Please feel free to reopen it if you have any further questions.

@Shreeraj-Splunk

Hi @cwadhwani-splunk, according to @DavidLopez-jr, the issue persists with the latest version of SC4S. Please re-open the GitHub issue.

Below are the sample logs shared by the customer, along with their expectations of how the fields should be extracted.

<5>Jan 15 09:26:00 [ontapnode-03:raid.spares.media_scrub.start:notice]: owner="", disk_info="Disk 0d.10.22P3 Shelf 10 Bay 22 [NETAPP X357_S163A3T8ATE NA54] S/N [S394NA0J101524NP003] UID [6002538A:07152820:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000]", blockNum="5248", shelf="10", bay="22", vendor="NETAPP ", model="X357_S163A3T8ATE", firmware_revision="NA54", serialno="S394NA0J101524NP003", disk_type="5", disk_rpm="N/A", carrier="", site="Local"

<5>Jan 15 09:26:00 [ontapnode-03:raid.spares.media_scrub.start:notice]: owner="", disk_info="Disk 0a.10.23P3 Shelf 10 Bay 23 [NETAPP X357_S163A3T8ATE NA54] S/N [S394NA0HC16400NP003] UID [6002538A:06CC69F0:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000]", blockNum="5248", shelf="10", bay="23", vendor="NETAPP ", model="X357_S163A3T8ATE", firmware_revision="NA54", serialno="S394NA0HC16400NP003", disk_type="5", disk_rpm="N/A", carrier="", site="Local"

<5>Jan 15 09:26:00 [ontapnode-03:raid.spares.media_scrub.start:notice]: owner="", disk_info="Disk 0d.10.22P3 Shelf 10 Bay 22 [NETAPP X357_S163A3T8ATE NA54] S/N [S394NA0J101524NP003] UID [6002538A:07152820:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000]", blockNum="5248", shelf="10", bay="22", vendor="NETAPP ", model="X357_S163A3T8ATE", firmware_revision="NA54", serialno="S394NA0J101524NP003", disk_type="5", disk_rpm="N/A", carrier="", site="Local"

EMS = ONTAP Event Management System

Datetime: Jan 15 09:26:00

Hostname: ontapnode-03

EMS Category: raid.spares.media_scrub.start

Event Level: notice

Event Message: owner="", disk_info="Disk 0d.10.22P3 Shelf 10 Bay 22 [NETAPP X357_S163A3T8ATE NA54] S/N [S394NA0J101524NP003] UID [6002538A:07152820:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000]", blockNum="5248", shelf="10", bay="22", vendor="NETAPP ", model="X357_S163A3T8ATE", firmware_revision="NA54", serialno="S394NA0J101524NP003", disk_type="5", disk_rpm="N/A", carrier="", site="Local"
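
To make the expected extraction concrete, here is a minimal Python sketch of the field breakdown listed above. It is an illustration of the requested result, not the SC4S parser and not a Splunk search-time extraction. The regular expressions and the names EMS_LINE_RE, KV_RE, and parse_ems_line are assumptions made up for this example, and SAMPLE is a shortened copy of the first sample line (disk_info and several other pairs are omitted for brevity).

```python
import re

# <PRI>, BSD timestamp, then "[hostname:ems.category:level]: message".
EMS_LINE_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<datetime>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"\[(?P<hostname>[^:\]]+):(?P<category>[^:\]]+):(?P<level>[^:\]]+)\]: "
    r"(?P<message>.*)$"
)
# key="value" pairs inside the event message; values may be empty.
KV_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_ems_line(line):
    """Split one EMS syslog line into header fields and key/value pairs."""
    match = EMS_LINE_RE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["fields"] = dict(KV_RE.findall(event["message"]))
    return event


# Shortened copy of the customer's sample line.
SAMPLE = (
    '<5>Jan 15 09:26:00 [ontapnode-03:raid.spares.media_scrub.start:notice]: '
    'owner="", blockNum="5248", shelf="10", bay="22", vendor="NETAPP ", site="Local"'
)

event = parse_ems_line(SAMPLE)
print(event["hostname"], event["category"], event["level"])
print(event["fields"])
```

The key/value pattern assumes that values never contain a double quote, which holds for the samples in this thread but may not hold for every EMS event.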

Below is the sc4s_tag value for rfc3164.

syslog.invalid_hostname|wireformat:rfc|wireformat:rfc3164|source_identified|vps|.app.app-netsource-netapp_ontap|.app.app-fix-invalid-program-z_bsdconvention|ns_vendor:netapp|ns_product:ontap|.source.s_NETAPP_ONTAP

The customer pointed to the "syslog.invalid" entry (syslog.invalid_hostname) in the tag value above as an indication that the parser is still not working.
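
When comparing tag values between events, it can help to split the pipe-delimited string into its entries. The Python sketch below does only that, using the value quoted above. The helper name split_sc4s_tags is made up for this example, and grouping entries by the ".app." and "syslog.invalid" prefixes is an assumption about how to read the string, not documented SC4S behavior.

```python
def split_sc4s_tags(tag_string):
    """Split a pipe-delimited tag string into a list, dropping empty entries."""
    return [tag for tag in tag_string.split("|") if tag]


# Tag value quoted in this comment.
TAGS = (
    "syslog.invalid_hostname|wireformat:rfc|wireformat:rfc3164|source_identified|vps|"
    ".app.app-netsource-netapp_ontap|.app.app-fix-invalid-program-z_bsdconvention|"
    "ns_vendor:netapp|ns_product:ontap|.source.s_NETAPP_ONTAP"
)

tags = split_sc4s_tags(TAGS)
# Entries that start with ".app." (assumed here to name the app parsers that ran).
app_parsers = [tag[len(".app."):] for tag in tags if tag.startswith(".app.")]
# Entries that start with "syslog.invalid" (the part the customer pointed out).
flagged = [tag for tag in tags if tag.startswith("syslog.invalid")]

print(app_parsers)
print(flagged)
```

Compared with the tag value in the original report, this one no longer lists the fallback/last-chance entry, while syslog.invalid_hostname is still present.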

Let me know if you need anything else.

@Shreeraj-Splunk

Hello @cwadhwani-splunk,

Requirement 1
As discussed with the customer, they need the fields below extracted for the rfc3164 format.

Event Message:
owner="", disk_info="Disk 0d.10.22P3 Shelf 10 Bay 22 [NETAPP X357_S163A3T8ATE NA54] S/N [S394NA0J101524NP003] UID [6002538A:07152820:500A0981:00000003:00000000:00000000:00000000:00000000:00000000:00000000]", blockNum="5248", shelf="10", bay="22", vendor="NETAPP ", model="X357_S163A3T8ATE", firmware_revision="NA54", serialno="S394NA0J101524NP003", disk_type="5", disk_rpm="N/A", carrier="", site="Local"

Hostname:
ontapnode-03

EMS:
ONTAP Event Management System

Datetime:
Jan 15 09:26:00

EMS Category:
raid.spares.media_scrub.start

Event Level:
notice

Requirement 2
We need a brief section in the SC4S documentation on which format should be used and why the rfc5424 format would be a good choice.

Please detail the technical benefits of this format if it is preferred.

For either log format, we ask that the SC4S documentation clearly details the configuration requirements and steps to configure ONTAP properly for the parser.

ONTAP EMS destination documentation for this is at: https://docs.netapp.com/us-en/ontap-cli//event-notification-destination-create.html

ONTAP Auditlog destination documentation is at: https://docs.netapp.com/us-en/ontap-cli/cluster-log-forwarding-create.html

Please ask if you wish for us to review any of the ONTAP commands.

Thanks,

@rjha-splunk
Collaborator

RE: Requirement 2: RFC3164 and RFC5424 are the protocols that define how syslog messages are built, and there are separate protocols for sending them. Explaining them is out of scope for the SC4S docs; our documentation explains how to ingest these messages correctly.

@cwadhwani-splunk
Collaborator

cwadhwani-splunk commented Feb 6, 2025

@Shreeraj-Splunk
Thanks for the update. We will check Requirement 1 and provide you with an update. Also, as I understand it, the logs are now being classified (index and sourcetype) correctly, and the remaining issue is with field extraction. Please correct me if I am wrong.
