The task was to break up short texts into 3, 4 or 5 paragraphs. I managed that by counting full stops (".") and determining the number of sentences that should make up a paragraph. However, the texts could contain decimal numbers, whose decimal delimiter is also ".". Furthermore, abbreviations such as "e.g." use ".", which I had to work around. Doing this in native LiveCode would have taken quite some coding, and I figured that matchChunk with a regular expression could be helpful. Regex is something I had avoided until now, since my needs so far could be met with LiveCode syntax.
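As a rough illustration of the sentence-counting idea described above, here is a minimal sketch. It is not the formatting code actually used; the handler name splitIntoParagraphs and the parameter pParagraphCount are made up for this example. It assumes that every remaining "." ends a sentence (abbreviations and decimals already escaped), that the text ends directly with its final full stop, and that pParagraphCount is at least 1.

```livecode
function splitIntoParagraphs pText, pParagraphCount
   local tTotal, tPerParagraph, tCount, tResult, tSentence
   ## assumption: every "." that is still in the text ends a sentence
   set the itemDelimiter to "."
   put the number of items of pText into tTotal
   ## round up so that no more than pParagraphCount paragraphs result
   put (tTotal + pParagraphCount - 1) div pParagraphCount into tPerParagraph
   put 0 into tCount
   repeat for each item tSentence in pText
      add 1 to tCount
      ## word 1 to -1 strips leading and trailing whitespace
      put (word 1 to -1 of tSentence) & "." after tResult
      if tCount < tTotal then
         if tCount mod tPerParagraph = 0 then
            ## paragraph break after every tPerParagraph sentences
            put return & return after tResult
         else
            put space after tResult
         end if
      end if
   end repeat
   return tResult
end splitIntoParagraphs
```

Because the sentence count is rounded up, the result can have fewer paragraphs than requested when the sentences do not divide evenly.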
Stam recommends the "Regex101" website; I followed his advice and found it a very sensible testing ground for regular expressions.
I want to post my solution here to the problem of cleaning up a text: temporarily replace the dots that are not full stops, then reintroduce them after formatting (the formatting is not shown here).
Since I did not find any working examples of "matchChunk", I am posting my function here simply as an example of its use.
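The step of reintroducing the dots could be as small as the sketch below. The handler name restoreEscapedDots is hypothetical, and the sketch assumes that "±" never occurs in the original text, since every "±" is turned back into a ".".

```livecode
function restoreEscapedDots pText
   ## assumption: every "±" in pText is a placeholder for an escaped dot
   replace "±" with "." in pText
   return pText
end restoreEscapedDots

## example: restoreEscapedDots("See e±g± the value 2±7 here.")
## returns "See e.g. the value 2.7 here."
```

If the texts might contain "±" themselves, a different placeholder character that cannot occur in them would be the safer choice.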
Code:
function escapeAbbrAndDezimalpoints pText
   local tStart, tEnd, tBegin, tVar, tRegex, tList
   local tFrom, tTo, aLine
   put pText into tVar
   ## this catches two-letter abbreviations such as "e.g." and decimals such as "2.7"
   put "([a-zA-Z]\.[a-zA-Z]\.|[0-9]\.[0-9])" into tRegex
   ## define search range; tFrom moves past each hit, excluding the text up to that hit
   put 1 into tFrom
   put the number of chars of tVar into tTo
   repeat
      ## tStart and tEnd are relative to the searched chunk, not to tVar
      if matchChunk(char tFrom to tTo of tVar, tRegex, tStart, tEnd) is true then
         ## build list of hits as absolute positions in tVar
         put tStart + tFrom - 1, tEnd + tFrom - 1 & return after tList
         ## update search range to start right after the hit
         add tEnd to tFrom
         if tFrom > tTo then
            exit repeat
         end if
      else
         exit repeat
      end if
   end repeat
   if tList is empty then return tVar
   ## sorting is not necessary here, but it is needed when the replacement changes the character count
   sort tList descending numeric by item 1 of each
   repeat for each line aLine in tList
      put item 1 of aLine into tBegin
      put item 2 of aLine into tEnd
      replace "." with "±" in char tBegin to tEnd of tVar
   end repeat
   return tVar
end escapeAbbrAndDezimalpoints
Kind regards
Bernd
