AWS Community Day Davao 2025

Simple Regex for Logging: An A-to-Z Guide

Jeon Il Shin

Simple Regex for Logging

Regular Expressions (RegEx) may look very complicated at first glance, but once you understand what they mean, they're actually very simple!

Original Log: Hello Im Jeon
Regular Expression: ^(?<data1>[^ ]*) (?<data2>[^ ]*) (?<data3>[^ ]*)$
Result: data1=Hello, data2=Im, data3=Jeon

The above expression simply splits a space-separated string into three values (Hello, Im, Jeon).
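As a rough illustration, the same expression can be tried out in TypeScript, whose RegExp engine accepts the (?<name>...) syntax as written. The variable names are only for this sketch, and the regex engine inside your logging tool may behave slightly differently.

```typescript
// The pattern from the example above, written as a regex literal.
const pattern = /^(?<data1>[^ ]*) (?<data2>[^ ]*) (?<data3>[^ ]*)$/;

// match() returns null when the line does not fit the pattern.
const match = "Hello Im Jeon".match(pattern);

// Named groups are exposed on match.groups.
const groups = match?.groups;
console.log(groups?.data1, groups?.data2, groups?.data3); // Hello Im Jeon
```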

Let's break it down piece by piece to see how it works.

1. ^ and $

^.....$
  • ^ at the beginning of a regular expression means start of the line.
  • $ at the end of the expression means end of the line.

In logging, this means "the whole line must fit this structure from beginning to end" rather than "it's okay if there's more content."

Think of it as putting a lid on each end of a pipe.

Image

In the image above, ^ and $ sit at the start and end of the pattern, and the pattern defines exactly three fields (data1, data2, and data3), so only lines with exactly three space-separated values match. The second pipe carries more than three values, so the pattern does not match, and the logger will typically treat the line as an invalid log and ignore it.

If you want to accept only the first three and ignore any additional data, you can remove the end anchor $.

Image
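  • $ Without the end anchor, the pattern also matches lines that carry extra values after the third field.
  • ^ For patterns like this one, removing the start anchor rarely changes the result, because matching already begins at the start of the line. Keeping it is still a good habit, since it stops a pattern from matching in the middle of a line.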
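The effect of the end anchor can be checked with a small TypeScript sketch. The four-value line below is a made-up input for illustration, not one taken from the article.

```typescript
// Same pattern with and without the end anchor $.
const strict = /^(?<data1>[^ ]*) (?<data2>[^ ]*) (?<data3>[^ ]*)$/;
const lenient = /^(?<data1>[^ ]*) (?<data2>[^ ]*) (?<data3>[^ ]*)/;

// Hypothetical log line with one value too many.
const fourValues = "Hello Im Jeon Shin";

// With $, the extra value makes the whole line fail.
console.log(strict.test(fourValues)); // false

// Without $, the first three values are captured and the rest is ignored.
const lenientMatch = lenient.exec(fourValues);
console.log(lenientMatch?.groups?.data3); // Jeon
```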

2. ( , ( ? and )

In regular expressions, ( ) does more than the grouping parentheses of a programming language. It creates a capture group: whatever the pattern inside the parentheses matches is stored so that you can retrieve it afterwards.

To understand this better, let's first look at the [ ? ] symbol that follows the opening parenthesis in [ (? ].

Capture groups are numbered from left to right: group 1, group 2, group 3, and so on. Each captured value is stored under its group number.

What happens if there's no [ ? ] symbol?

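The [ ? ] symbol does not match anything by itself. Placed directly after the opening parenthesis, it tells the regex engine that special group syntax follows: (?<name>...) creates a named capture group, and (?:...) creates a group that captures nothing.
If it is omitted, ( ) is still a capture group and captures exactly the same text, but the value can only be referenced by its group number.

In a typical logging environment, the group names become the field names of the parsed log, so write every value you want to keep as [ (?<name> ].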
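The difference between the group forms can be seen in a short TypeScript sketch. It compares a plain ( ) group, a named (?<name>...) group, and a third form, the non-capturing (?:...) group, which groups without storing anything. The variable names are chosen only for this example.

```typescript
const line = "Hello Im Jeon";

// Plain ( ): captured values are available only by number.
const numbered = line.match(/^([^ ]*) ([^ ]*) ([^ ]*)$/);
console.log(numbered?.[1], numbered?.[3]); // Hello Jeon

// (?<name>...): captured values are also available by name.
const named = line.match(/^(?<data1>[^ ]*) (?<data2>[^ ]*) (?<data3>[^ ]*)$/);
console.log(named?.groups?.data2); // Im

// (?:...): matches but stores nothing, so the next ( ) becomes group 1.
const skipped = line.match(/^(?:[^ ]*) ([^ ]*) ([^ ]*)$/);
console.log(skipped?.[1]); // Im
```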

3. <name>

The <name> symbol works just as you'd expect: it assigns a name to each capture group. In (?<data1>[^ ]*), for example, the captured value is stored under the name data1.
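One practical consequence is that a matched line turns directly into a keyed record. The sketch below, in TypeScript, copies the named groups into a plain object and prints it as JSON; the variable names are illustrative only.

```typescript
const namedMatch = "Hello Im Jeon".match(
  /^(?<data1>[^ ]*) (?<data2>[^ ]*) (?<data3>[^ ]*)$/
);

// Copy the named groups into a plain object so they can be serialized.
const record = { ...namedMatch?.groups };
console.log(JSON.stringify(record)); // {"data1":"Hello","data2":"Im","data3":"Jeon"}
```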

4. [^ ]*

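The [^ ]* pattern defines how far each capture extends.

It has three parts: [ ] is a character class that matches any one of the characters listed inside it, a ^ placed first inside the brackets negates the class (this is a different ^ from the start-of-line anchor), and * repeats the match zero or more times. In this case, it's written as [^ ]*, which means:
“Please capture every character that is not a space.”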

Put more simply, it means:
“I’ll keep capturing until I hit a space.”

In this way, using [^ ]*, you can read until a space appears in the log.
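To see why [^ ]* is used rather than the more familiar .*, the following TypeScript sketch compares the two on the same line. The group names a and b are placeholders for this comparison only.

```typescript
const sample = "Hello Im Jeon";

// .* is greedy and also matches spaces, so the first group swallows two words.
const greedy = sample.match(/^(?<a>.*) (?<b>.*)$/);
console.log(greedy?.groups?.a); // Hello Im
console.log(greedy?.groups?.b); // Jeon

// [^ ]* cannot cross a space, so the first group stops after one word.
const bounded = sample.match(/^(?<a>[^ ]*) (?<b>.*)$/);
console.log(bounded?.groups?.a); // Hello
console.log(bounded?.groups?.b); // Im Jeon
```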

5. Application

Let’s apply the above patterns to analyze the following log!

A simple log separated by spaces

Original Log: 2025/07/21:23:15:00 GET /healthcheck 502
Regular Expression: ^(?<time>[^ ]*) (?<method>[^ ]*) (?<path>[^ ]*) (?<status_code>[^ ]*)$
Result: time=2025/07/21:23:15:00, method=GET, path=/healthcheck, status_code=502
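A minimal TypeScript sketch of how this pattern might be wrapped in a parsing function. The AccessLog interface and the parseAccessLog function are hypothetical names introduced here, and converting the status code to a number is an extra step that the original pattern does not perform.

```typescript
// Hypothetical record shape for this sketch.
interface AccessLog {
  time: string;
  method: string;
  path: string;
  statusCode: number;
}

const spacePattern =
  /^(?<time>[^ ]*) (?<method>[^ ]*) (?<path>[^ ]*) (?<status_code>[^ ]*)$/;

// Returns null for lines that do not fit the pattern.
function parseAccessLog(line: string): AccessLog | null {
  const groups = spacePattern.exec(line)?.groups;
  if (!groups) return null;
  return {
    time: groups.time,
    method: groups.method,
    path: groups.path,
    statusCode: Number(groups.status_code), // captured text is always a string
  };
}

console.log(parseAccessLog("2025/07/21:23:15:00 GET /healthcheck 502"));
console.log(parseAccessLog("not a log line")); // null
```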

A simple log separated by commas

Original Log: 2025/07/21:23:15:00,GET,/healthcheck,502
Regular Expression: ^(?<time>[^,]*),(?<method>[^,]*),(?<path>[^,]*),(?<status_code>[^,]*)$
Result: time=2025/07/21:23:15:00, method=GET, path=/healthcheck, status_code=502
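Because * means "zero or more", a field is allowed to be empty. The TypeScript sketch below shows this with a made-up line whose method is missing, and contrasts it with a variant that uses + ("one or more"). The + variant is not part of the article; it is included only for comparison.

```typescript
const withStar =
  /^(?<time>[^,]*),(?<method>[^,]*),(?<path>[^,]*),(?<status_code>[^,]*)$/;
const withPlus =
  /^(?<time>[^,]+),(?<method>[^,]+),(?<path>[^,]+),(?<status_code>[^,]+)$/;

// Hypothetical line with an empty method field.
const missingMethod = "2025/07/21:23:15:00,,/healthcheck,502";

// With *, the line still matches and method is an empty string.
console.log(withStar.exec(missingMethod)?.groups?.method === ""); // true

// With +, every field needs at least one character, so the line is rejected.
console.log(withPlus.test(missingMethod)); // false
```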

A simple log separated by multiple delimiters

Original Log: 2025/07/21:23:15:00 GET,/healthcheck 502
Regular Expression: ^(?<time>[^ ]*) (?<method>[^,]*),(?<path>[^ ]*) (?<status_code>[^ ]*)$
Result: time=2025/07/21:23:15:00, method=GET, path=/healthcheck, status_code=502
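The rule behind this pattern is that each field is written as "everything except the delimiter that follows it", then the delimiter itself. A small TypeScript sketch can build such a pattern from a field list. Field, escapeRegex, and buildPattern are hypothetical helpers invented for this example; the sketch assumes single-character delimiters and, unlike the expression above, captures the last field with .* because no delimiter follows it.

```typescript
// One field: its name and the single character that ends it ("" for the last field).
type Field = { name: string; until: string };

// Escape characters that are special in a regex, inside or outside [ ].
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\\-]/g, "\\$&");
}

function buildPattern(fields: Field[]): RegExp {
  const parts = fields.map(({ name, until }) => {
    if (until === "") return `(?<${name}>.*)`; // last field: take the rest
    const d = escapeRegex(until);
    return `(?<${name}>[^${d}]*)${d}`; // field, then its delimiter
  });
  return new RegExp(`^${parts.join("")}$`);
}

const mixedPattern = buildPattern([
  { name: "time", until: " " },
  { name: "method", until: "," },
  { name: "path", until: " " },
  { name: "status_code", until: "" },
]);

const mixed = mixedPattern.exec("2025/07/21:23:15:00 GET,/healthcheck 502")?.groups;
console.log(mixed?.method, mixed?.path, mixed?.status_code); // GET /healthcheck 502
```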

When there is a value you don't need

Original Log: 2025/07/21:23:15:00 hello.com GET /healthcheck 502
Regular Expression: ^(?<time>[^ ]*) [^ ]* (?<method>[^ ]*) (?<path>[^ ]*) (?<status_code>[^ ]*)$
Result: time=2025/07/21:23:15:00, method=GET, path=/healthcheck, status_code=502
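The second field is written as a bare [^ ]* with no parentheses around it, so it is matched but never stored. A quick TypeScript check, with variable names chosen for this sketch, lists the captured field names to confirm that the host name is absent.

```typescript
const skipPattern =
  /^(?<time>[^ ]*) [^ ]* (?<method>[^ ]*) (?<path>[^ ]*) (?<status_code>[^ ]*)$/;

const skipGroups = skipPattern.exec(
  "2025/07/21:23:15:00 hello.com GET /healthcheck 502"
)?.groups;

// Only the named groups appear in the result; hello.com was consumed but not kept.
const fieldNames = Object.keys(skipGroups ?? {});
console.log(fieldNames.join(", ")); // time, method, path, status_code
console.log(skipGroups?.method); // GET
```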

When wrapped in square brackets

Original Log: [2025/07/21:23:15:00] GET /healthcheck 502
Regular Expression: ^\[(?<time>[^\]]*)\] (?<method>[^ ]*) (?<path>[^ ]*) (?<status_code>[^ ]*)$
Result: time=2025/07/21:23:15:00, method=GET, path=/healthcheck, status_code=502
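Square brackets are special characters in a regex, so the literal ones are escaped as \[ and \], including the ] inside the negated class [^\]]. The TypeScript sketch below runs the pattern and also shows that a line without the brackets no longer fits. Variable names are illustrative.

```typescript
const bracketPattern =
  /^\[(?<time>[^\]]*)\] (?<method>[^ ]*) (?<path>[^ ]*) (?<status_code>[^ ]*)$/;

const bracketGroups = bracketPattern.exec(
  "[2025/07/21:23:15:00] GET /healthcheck 502"
)?.groups;

// The brackets sit outside the group, so they are not part of the captured time.
console.log(bracketGroups?.time); // 2025/07/21:23:15:00

// The same log without brackets does not match this pattern.
console.log(bracketPattern.test("2025/07/21:23:15:00 GET /healthcheck 502")); // false
```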

When enclosed in quotation marks

Original Log: 2025/07/21:23:15:00 "GET /healthcheck 502"
Regular Expression: ^(?<time>[^ ]*) "(?<method>[^ ]*) (?<path>[^ ]*) (?<status_code>[^"]*)"$
Result: time=2025/07/21:23:15:00, method=GET, path=/healthcheck, status_code=502
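A short TypeScript check of the quoted case. Note that the time field is written as [^ ]* with a space inside the brackets; an empty [^] is rejected or interpreted differently by many regex engines. The last field uses [^"]* so that it stops at the closing quotation mark. Variable names are for this sketch only.

```typescript
const quotedPattern =
  /^(?<time>[^ ]*) "(?<method>[^ ]*) (?<path>[^ ]*) (?<status_code>[^"]*)"$/;

const quotedGroups = quotedPattern.exec(
  '2025/07/21:23:15:00 "GET /healthcheck 502"'
)?.groups;

// The quotation marks are matched outside the groups, so they are not captured.
console.log(quotedGroups?.method, quotedGroups?.status_code); // GET 502
```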

6. Parsing Real Application Logs

About Jeon Il Shin

Jeon Il Shin is the CTO and co-founder of Amixtra. He leads all technical operations, overseeing the development and implementation of the company’s core technologies. He is responsible for driving innovation, managing the engineering team, and ensuring that Amixtra’s products are reliable, scalable, and cutting-edge. His technical expertise and vision play a crucial role in shaping Amixtra’s solutions and maintaining the company’s reputation for excellence in the tech industry.

Amixtra © 2024-2025 Amixtra. All rights reserved.