Gone missing: my logs

Hello guys,

Context:
I'm working on custom parsers for logs that cannot be handled by the native SecOps parsers. Once a parser is done, I need to validate it against a large number of logs. To do so, I export a few tens of thousands of raw logs from the previous SIEM and ingest them through the ingestion API. These raw logs can span several days, so that there are enough of them.
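For readers reproducing this kind of bulk replay, here is a minimal sketch of splitting an export into request-sized batches before sending them. It is not the exact script used here. The size and count limits are hypothetical defaults, and the payload field names (`customer_id`, `log_type`, `entries`, `log_text`) are assumptions modelled on the unstructured-log endpoint of the ingestion API, so check them against the current API reference before relying on them.

```python
from typing import Dict, Iterable, Iterator, List


def batch_log_lines(
    lines: Iterable[str],
    max_bytes: int = 900_000,   # hypothetical safety margin, not an official limit
    max_entries: int = 1000,    # hypothetical cap on entries per request
) -> Iterator[List[str]]:
    """Group raw log lines into batches bounded by total size and entry count."""
    batch: List[str] = []
    size = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines from the export
        line_size = len(line.encode("utf-8"))
        # Flush the current batch before it would exceed either limit.
        if batch and (size + line_size > max_bytes or len(batch) >= max_entries):
            yield batch
            batch, size = [], 0
        batch.append(line)
        size += line_size
    if batch:
        yield batch


def build_payload(customer_id: str, log_type: str, batch: List[str]) -> Dict:
    """Build one request body; field names are assumptions, see the note above."""
    return {
        "customer_id": customer_id,
        "log_type": log_type,
        "entries": [{"log_text": text} for text in batch],
    }
```

Each payload would then be posted in its own request, which also makes it easy to log how many lines were sent per batch and compare that with the dashboard count.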

Problem:
Even though the default "Ingestion dashboard" lists the correct number of log lines ingested (48976), I can only see 1200 lines in raw log search. They are all from today, at the correct time from the payload. I see no previous events even though I ingested raw logs from the last 30 days or so.

Where did my logs go?

Am I doing something wrong, or missing some SecOps concept? Maybe I cannot ingest logs that are "too old"? If so, what is the "too old" threshold?

NB:

  • I have no choice but to manually ingest logs from the previous SIEM to get them into Chronicle and work on the parser; there is no way to set up direct, continuous collection.
  • When initially importing a few thousand (5,000-10,000) logs to start parsing, I don't have this issue: they all appear in raw log search at the time of ingestion (since they are not parsed yet).
  • From the Ingestion Dashboard, I can see that every line from the "validation logs" is correctly parsed; the Parsing Error count and the other error counters all display 0.

Thanks in advance for your input 🙂


@chrisd2 wrote:

They are all from today, at the correct time from the payload


Is "today" the ingestion timestamp or the event timestamp? How are you searching for the logs? Are you doing raw log search or UDM search? I am assuming the data is also now parsed by the custom parsers you developed. If the logs are not parsed and the event timestamp is not from the day the logs were ingested, then you need to expand your search. You can try to do a UDM search and filter by log source and ingestion timestamp.
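As a rough illustration of "expanding the search", the sketch below derives a search window from the event timestamps found in the exported logs, so that the window covers the oldest and newest events rather than only the ingestion day. It assumes the timestamps have already been extracted from the raw lines as ISO 8601 strings with an explicit offset; the function name and the one-hour padding are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Tuple


def search_window(
    event_times: Iterable[str],
    padding: timedelta = timedelta(hours=1),  # hypothetical margin around the data
) -> Tuple[datetime, datetime]:
    """Return (start, end) in UTC covering every event timestamp, plus padding."""
    parsed = [
        # Accept a trailing "Z" as UTC, then normalize everything to UTC.
        datetime.fromisoformat(t.replace("Z", "+00:00")).astimezone(timezone.utc)
        for t in event_times
    ]
    if not parsed:
        raise ValueError("no event timestamps supplied")
    return min(parsed) - padding, max(parsed) + padding
```

The resulting start and end can be entered as the time range of a raw log search or a UDM search, which rules out a too-narrow window as the reason for missing events.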


 

Hello @Rene_Figueroa ,

Sorry if it was not clear, by "today" I meant the event timestamp.

To sum up:

  • I ingested some logs (~10,000) prior to parser development --> they all appeared instantly in Chronicle at the ingestion timestamp, as unparsed. OK!
  • I developed the parser on the basis of these events.
  • I made a validation ingestion (~50,000 log lines) --> only ~1,000 logs appeared in raw log search, correctly parsed, with their event timestamp correctly set, and I couldn't find the other events (neither parsed nor unparsed). Yet the ingestion dashboard showed that ~50,000 log lines had been received.

I used raw log search (regex "." with the correct log_type) to check whether my events were present. The time span was set to cover several days, but I could only see roughly 1,000 events.

I checked again this morning and the other events now appear in raw log search, correctly parsed and correctly placed on the timeline at their event timestamp.

Still, I'm curious whether this is expected behavior and, if there are delays when a large volume of logs arrives at the same time, how Chronicle prioritizes parsing.

Thanks for the update @chrisd2. Raw log search (RLS) prioritizes real-time data, so if the newly parsed logs were not current, RLS can take some time to process those older logs. UDM search returns all parsed events regardless of how old they are. That's the only thing I can think of that happened here. I hope that helps.
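If older logs can lag in raw log search, one way to tell "still pending" from "lost" is to compare per-day counts of what was sent with per-day counts of what the search returns. The sketch below is only that comparison, not an official tool. It assumes both sides are available as lists of ISO 8601 event timestamps with an explicit offset (for example from the export file and from a downloaded search result), and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Dict, Iterable


def count_per_day(timestamps: Iterable[str]) -> Counter:
    """Count events per UTC calendar day."""
    days: Counter = Counter()
    for t in timestamps:
        dt = datetime.fromisoformat(t.replace("Z", "+00:00")).astimezone(timezone.utc)
        days[dt.date().isoformat()] += 1
    return days


def missing_per_day(sent: Iterable[str], found: Iterable[str]) -> Dict[str, int]:
    """Return, for each day, how many sent events are not yet visible in search."""
    expected = count_per_day(sent)
    seen = count_per_day(found)
    # A Counter returns 0 for days that have no events in the search results.
    return {
        day: expected[day] - seen[day]
        for day in sorted(expected)
        if expected[day] > seen[day]
    }
```

Running this again a few hours later shows whether the gap on the older days is shrinking, which would be consistent with delayed processing rather than dropped logs.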