delimited string
  a string broken into substrings by delimiters (duh)
balanced string
  uses some kind of brace as delimiter
  opening brace has closing brace that balances
  they nest

matching some doublequoted string
  first try: /".*"/g
    no good if multiple "foo" on one line -- you get: "foo" bar "foo"
  then: /".*"/gs
    no good either (without /s, . stops at a linebreak; with /s, the greedy match runs on to the last quote in the whole string)
  /"[^"]*"/g - ok
  /".*?"/gs - ok
  you'd think [^"] is faster but it isn't, because of the optimizer
    with .*? it finds the quotes, starts there
    with a char class, it can't: if you use /["].*?["]/gs it slows down (a lot)

now let's look for questions
  double-quoted string with ? before closequote
  /".*?\?"/gs is no good; will extend quote past correct end-quote to the first ? that precedes a quote
  /"[^"]*\?"/gs - works

what if delimiter can be escaped?
  previous regex won't work
  /"(?:[^"]|\\")*"/g -- no good; the \ in \" gets matched as [^"], not as the other disjunct!
  try switching order: /"(?:\\"|[^"])*"/g -- that does it
  ...except "\\" doesn't work as it should; the escape can't be escaped!
  /"(?:[^"\\]|\\.)*"/gs -- that does it better; \ won't match first class
  now, ((...)+)* can cause Perl to take a /very/ long time to determine that there is no match!
  try looking at an actual example:
    "........ \" ....... \\ ........"
  looks like:
    "[^\\"]* \\. [^\\"]* \\. [^\\"]*"
  so maybe:
    "[^\\"]*(?:\\.[^\\"]*)*"
  notice: no branching
  ends up about twice as fast as previous RE
  MRE calls this "loop unrolling"

what if escape char /is/ delim char?
  like in SQL: DELETE foo WHERE bar='O''Reilly'
  easy -- but hard to transcribe fast enough
  basically replace the escape bits in the above RE

what if delim is longer than one char?
  like html comments
  scary examples of stupid people's commenting pathologies, like this is a comment
  the secret to
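
A minimal Perl sketch of the patterns in these notes, run side by side. The sample strings and the variable names are made up for illustration, and the SQL pattern is only one reading of "replace the escape bits" (the notes do not spell that regex out), so treat it as an assumption.

```perl
use strict;
use warnings;

# Hypothetical sample inputs, made up for this sketch.
my $line = 'say "foo" bar "foo" now';
my $qs   = '"who?" and "not a question" and "why?"';
# Single-quoted Perl literal; the text it holds is:
#   "a \"b\" c" then "C:\\" and "D:\\"
my $esc  = '"a \"b\" c" then "C:\\\\" and "D:\\\\"';
my $sql  = q{DELETE foo WHERE bar='O''Reilly'};

# Greedy .* runs from the first quote to the last quote on the line.
my @greedy = $line =~ /".*"/g;

# A negated class or a non-greedy .*? stops at the nearest closing quote.
my @class = $line =~ /"[^"]*"/g;
my @lazy  = $line =~ /".*?"/gs;

# Questions: .*? is allowed to walk past a closing quote to reach ?",
# while [^"]* can never cross a quote.
my @q_bad  = $qs =~ /".*?\?"/gs;
my @q_good = $qs =~ /"[^"]*\?"/gs;

# Escaped delimiters. The alternation that only knows \" misreads \\"
# (an escaped backslash followed by the real closing quote).
my @naive    = $esc =~ /"(?:\\"|[^"])*"/g;
my @escaped  = $esc =~ /"(?:[^"\\]|\\.)*"/gs;

# Unrolled form: normal* (special normal*)*, no alternation.
my @unrolled = $esc =~ /"[^\\"]*(?:\\.[^\\"]*)*"/gs;

# Assumed SQL variant: the "special" part becomes a doubled quote.
my @sql_str  = $sql =~ /'[^']*(?:''[^']*)*'/g;

# Print each result list on one line, matches separated by " | ".
sub show {
    my ($label, @matches) = @_;
    printf "%-9s %s\n", $label, join(' | ', @matches);
}

show('greedy',   @greedy);
show('class',    @class);
show('lazy',     @lazy);
show('q_bad',    @q_bad);
show('q_good',   @q_good);
show('naive',    @naive);
show('escaped',  @escaped);
show('unrolled', @unrolled);
show('sql',      @sql_str);
```

On these inputs the greedy pattern returns one over-long match, the \"-only alternation swallows the text between "C:\\" and the next opening quote, and the alternation and unrolled forms agree on three strings. The sketch says nothing about speed; the "twice as fast" figure is from the talk and would need a benchmark to confirm.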