Regex guide
To identify source video data and ephys data, Cheese3D uses “regular expressions”. These are strings that specify a pattern to search for in filenames. Though powerful, they can be daunting for beginners, so Cheese3D has several features to make the process easier.
What is a regular expression?
A regular expression (regex) is like a search pattern that can match multiple similar text strings. Think of it as a template that describes what your filenames look like.
For example, if your video files are named like:
mouse1_cal_TL_001.avi
mouse1_behavior_TR_002.avi
mouse2_cal_BC_001.avi
You could describe this pattern as:
- starts with anything: `.*`
- then underscore: `_`
- then type of video: `(behavior|cal)`
- then underscore: `_`
- then camera view: `(TL|TR|BC|TC|L|R)`
- then underscore: `_`
- then numbers: `\d+`
- then “.avi”: `\.avi`

Putting it all together, we get the regex: `.*_(behavior|cal)_(TL|TR|BC|TC|L|R)_\d+\.avi`. The rest of this guide explains how to understand and build such expressions in more detail. For a more comprehensive explanation, see this guide.
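As a quick check, a pattern like this can be tried against the example filenames with Python's built-in `re` module. This is a minimal sketch and not part of Cheese3D: the `notes.txt` entry is a hypothetical non-video file added for contrast, and the alternatives are grouped with parentheses, since square brackets would instead match a single character from a set.

```python
import re

# Parentheses group the alternatives; square brackets would define
# a set of single characters instead.
pattern = re.compile(r".*_(behavior|cal)_(TL|TR|BC|TC|L|R)_\d+\.avi")

filenames = [
    "mouse1_cal_TL_001.avi",
    "mouse1_behavior_TR_002.avi",
    "mouse2_cal_BC_001.avi",
    "notes.txt",  # hypothetical non-video file, should not match
]

# fullmatch requires the whole filename to fit the pattern
matched = [name for name in filenames if pattern.fullmatch(name)]
print(matched)
```

Only the three `.avi` files are kept. Using `fullmatch` rather than `search` avoids accidentally accepting names that merely contain a matching substring.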
Common regex patterns
Here are the most common regex symbols you’ll encounter:
| Pattern | Meaning | Example use | Example matches |
|---|---|---|---|
| `.` | Any single character | `a.c` | abc, a1c, a-c |
| `*` | Zero or more of previous pattern | `ab*c` | ac, abc, abbc |
| `+` | One or more of previous pattern | `ab+c` | abc, abbc (not ac) |
| `[abc]` | Any of the characters a, b, or c | `[TL]R` | TR, LR |
| `[^abc]` | Any character except a, b, or c | `[^_]+` | mouse1, cal (not _) |
| `\|` | OR operator | `TL\|TR\|BC` | TL, TR, or BC |
| `\.` | Literal dot (escaped) | `file\.avi` | file.avi |
| `\d` | Any digit 0-9 | `\d+` | 1, 123, 007 |
| `\w` | Any word character (letter, digit, or underscore) | `\w+` | mouse1, cal, TL |
| `.*` | Any characters (wildcard) | `.*\.avi` | anything ending in .avi |
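The symbols in this table can be tried out directly with Python's built-in `re` module. The sketch below pairs a few illustrative patterns with sample strings (chosen here for demonstration, not taken from Cheese3D) and checks whether each string matches as a whole.

```python
import re

# Each entry: (pattern, text, whether the whole text should match)
checks = [
    (r"a.c", "a1c", True),              # . matches any single character
    (r"ab*c", "ac", True),              # * allows zero occurrences of b
    (r"ab+c", "ac", False),             # + needs at least one b
    (r"[TL]R", "LR", True),             # [TL] matches a single T or L
    (r"[^_]+", "mouse1", True),         # one or more non-underscore characters
    (r"[^_]+", "a_b", False),           # the underscore is excluded
    (r"TL|TR|BC", "BC", True),          # | chooses between alternatives
    (r"file\.avi", "fileXavi", False),  # escaped dot only matches "."
    (r"\d+", "007", True),              # one or more digits
    (r"\w+", "mouse1", True),           # letters, digits, or underscore
    (r".*\.avi", "anything.avi", True), # wildcard followed by a literal ".avi"
]

results = [bool(re.fullmatch(p, text)) == expected for p, text, expected in checks]
print(all(results))
```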
Named groups in Python regex
Python regex supports named groups, which let you capture and reference specific parts of the match by name instead of by position.
Syntax: `(?P<name>pattern)` creates a named group called “name”.
Example:
(?P<mouse>mouse\d+)_(?P<type>\w+)_(?P<view>TL|TR|BC)_(?P<session>\d+)\.avi
This would match mouse1_cal_TL_001.avi and capture:
| Key | Pattern | Matched value |
|---|---|---|
| mouse | `mouse\d+` | mouse1 |
| type | `\w+` | cal |
| view | `TL\|TR\|BC` | TL |
| session | `\d+` | 001 |
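In plain Python, the captured values can be read back by name. The following minimal sketch applies the example pattern to the example filename using only the standard `re` module; it illustrates named groups in general rather than how Cheese3D calls them internally.

```python
import re

pattern = re.compile(
    r"(?P<mouse>mouse\d+)_(?P<type>\w+)_(?P<view>TL|TR|BC)_(?P<session>\d+)\.avi"
)

match = pattern.fullmatch("mouse1_cal_TL_001.avi")

# groupdict() returns every named group as a {name: matched text} mapping
groups = match.groupdict()
print(groups)

# A single group can also be looked up by its name
view = match.group("view")
print(view)
```

Looking groups up by name keeps the code readable even if the order of the groups in the pattern changes later.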
Cheese3D configuration dictionary format
Cheese3D allows you to define regex patterns using a more readable dictionary format instead of writing complex regex strings directly.
Dictionary structure
video_regex:
  _path_: '.*_{{type}}_{{view}}_{{session}}.*\.avi'
  type: '[^_]+'
  view: 'TL|TR|L|R|TC|BC'
  session: '\d+'
Required Keys:
- `_path_`: The main filename regex pattern with `{{placeholders}}` for groups
- `view`: Camera view pattern (required for multi-camera setups)
Optional Keys: any additional groups you want to capture (type, session, etc.)
Internally, Cheese3D will build the corresponding regex string for you. For example, the transformation looks something like this:
# Dictionary input:
{
    "_path_": ".*_{{type}}_{{view}}.*\.avi",
    "type": "[^_]+",
    "view": "TL|TR|L|R|TC|BC"
}
# Becomes Python regex:
".*_(?P<type>[^_]+)_(?P<view>TL|TR|L|R|TC|BC).*\.avi"
Using regex groups in Cheese3D configuration
Once defined, the regex groups can be used throughout your configuration:
In sessions (see Recording options):
sessions:
  - name: session1
    type: behavior # Filters files where type group = "behavior"
  - name: session2
    type: cal # Filters files where type group = "cal"
In calibration (see Calibration options):
calibration:
  type: cal # Uses files where type group = "cal"
This allows you to organize different types of videos (calibration vs. behavior) and different sessions within the same directory structure.
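To make the idea concrete, the sketch below shows one way the dictionary format could be expanded into a regex with named groups and then used to filter files by group value. It is an illustration under stated assumptions, not Cheese3D's actual implementation: `build_regex` and `select_files` are hypothetical helpers written for this example, and the filenames are the ones used earlier in this guide.

```python
import re


def build_regex(spec):
    """Hypothetical helper: turn each {{name}} placeholder into a named group."""
    def expand(m):
        name = m.group(1)
        return f"(?P<{name}>{spec[name]})"

    return re.sub(r"\{\{(\w+)\}\}", expand, spec["_path_"])


# Same keys as the video_regex example in this guide
spec = {
    "_path_": r".*_{{type}}_{{view}}_{{session}}.*\.avi",
    "type": r"[^_]+",
    "view": r"TL|TR|L|R|TC|BC",
    "session": r"\d+",
}
video_regex = re.compile(build_regex(spec))


def select_files(filenames, **wanted):
    """Hypothetical helper: keep files whose named groups equal the wanted values."""
    selected = []
    for name in filenames:
        m = video_regex.fullmatch(name)
        if m and all(m.group(key) == value for key, value in wanted.items()):
            selected.append(name)
    return selected


filenames = [
    "mouse1_cal_TL_001.avi",
    "mouse1_behavior_TR_002.avi",
    "mouse2_cal_BC_001.avi",
]

# Mirrors "type: cal" and "type: behavior" in the configuration
cal_files = select_files(filenames, type="cal")
behavior_files = select_files(filenames, type="behavior")
print(cal_files)
print(behavior_files)
```

Filtering on the named group rather than on a substring of the filename means a file is selected only when the value appears in the expected position.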