Creating Parallel Texts
Posted: Sat Jan 14, 2012 12:43 am
by PoLyGLoT
Hi all,
I'm interested in using LiveCode to create bilingual, parallel texts that could be read in a PDF or Word Doc. I'm first trying to write code so that I can align the English text with its foreign translation. I'd appreciate any help in this matter.
I'm attempting to first edit both texts to break them down by punctuation marks (such as putting "return" after each comma, period, or any other punctuation mark). Once both fields are edited, I'm looking for a way to merge them - such that two columns of text are created (one for English and the other for the second language).
What code can I use so that the program will keep "searching" until it finds a punctuation mark (upon which it will put return into the field)? Thanks for any help you can provide!
Re: Creating Parallel Texts
Posted: Sat Jan 14, 2012 5:44 am
by kdjanz
I assume that you are going to have two fields side by side on a card, with your raw English text read or pasted in on the left, and the raw (French) text on the right. Your first processing of the raw text is to chop it into phrases based on punctuation?
If so, you need to become good friends with the "replace" command. This example from the dictionary does the opposite of what you want, deleting line breaks to make one big chunk of text:
Code: Select all
replace return with empty in field 1 -- runs lines together
This might be more what you would like to see:
Code: Select all
replace comma with comma & return in field 1
replace comma with comma & return in field 2 -- adds a line break after every comma
So depending on how many times you want to subdivide the text, you do a replace on each character you need to divide on, and your fields will get longer and longer.
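To save writing one replace line per character, the same idea can be put in a loop. This is only a sketch: the list of marks in tMarks is my own guess at what you might want to split on, and it works on field 1 as in the examples above.

```livecode
on mouseUp
   -- Hypothetical list of characters to break on; adjust to taste.
   put ",.;:!?" into tMarks
   put field 1 into tText
   repeat for each char tMark in tMarks
      -- Keep the mark itself and start a new line right after it.
      replace tMark with tMark & return in tText
   end repeat
   put tText into field 1
end mouseUp
```

Working on a variable and writing the field once at the end is usually faster than running each replace directly on the field. Note that an ellipsis like "..." will be split into three lines by this approach.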
If you want to get tricky, you could add some code to each field so that when you scrolled one field, the other would scroll as well so that they stayed sync'd in your reader.
Hope this gets you started,
Kelly
Re: Creating Parallel Texts
Posted: Sat Jan 14, 2012 6:02 pm
by PoLyGLoT
kdjanz wrote: "I assume that you are going to have two fields side by side on a card, with your raw English text read or pasted in on the left, and the raw (French) text on the right."
Yes, this is correct. I'd also like to add (eventually) some nice coloring and formatting, so that the columns correspond in color and length (and so on) to make for easy simultaneous viewing (that is the key).
"If you want to get tricky, you could add some code to each field so that when you scrolled one field, the other would scroll as well so that they stayed sync'd in your reader."
This would be perfect, actually! I've used the "replace" function, and I'm pretty much satisfied with how the text is being subdivided. One thing: I'm left with a field where some sentences have a space before starting the sentence, such as this:
"The xxxxxxx
That"
How can I remove all the empty spaces (as in the second line of text)?
Thanks sooooo much for all your help so far.
Re: Creating Parallel Texts
Posted: Sun Jan 15, 2012 12:35 am
by kdjanz
You could use the replace command again to search for all instances of 2 spaces "  " and replace them with 1 space " ". Depending on the text, you might want your code to do that 10 times, to get rid of all the double spaces. To get rid of the space at the beginning of a line, you could look at each line of text and delete the first char if it is a space.
Code: Select all
-- untested sketch
repeat with tLine = 1 to the number of lines in field 1
   -- strip every leading space, not just the first one
   repeat while char 1 of line tLine of field 1 is space
      delete char 1 of line tLine of field 1
   end repeat
end repeat
& do that to each field until things are cleaned up.
Kelly
Re: Creating Parallel Texts
Posted: Sun Jan 15, 2012 4:03 am
by sturgis
If you have text, each sentence on a separate line, and have leading spaces for some of the lines, you can probably replace return & space with return.
Obviously this won't work for the very first line if there is an issue there, but for the rest it hopefully will. You could also use replaceText to search for lines starting with a space and remove the space that way, but you have to know a little about regular expressions to do it that way.
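A rough sketch of both approaches, assuming the text is in field 1. The loop and the separate first-line check are my additions to cover lines with more than one leading space. The regular expression in the comment is an untested assumption about how the pattern would look.

```livecode
on mouseUp
   put field 1 into tText
   -- Plain replace: repeat until no line starts with a space.
   repeat while tText contains (return & space)
      replace return & space with return in tText
   end repeat
   -- The very first line has no return before it, so handle it separately.
   repeat while char 1 of tText is space
      delete char 1 of tText
   end repeat
   put tText into field 1
   -- Possible regex alternative (untested): "(?m)^ +" should match
   -- one or more spaces at the start of any line.
   -- put replaceText(field 1, "(?m)^ +", empty) into field 1
end mouseUp
```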
Re: Creating Parallel Texts
Posted: Sun Jan 15, 2012 5:35 pm
by PoLyGLoT
Thanks for all the replies everyone.
The texts are now formatted and subdivided how I want them. Is there any graceful way to merge the two texts? It would be awesome if I could somehow write code so that the two fields merged together (with a long vertical line separating them). Even better would be if I could find a way to merge the two texts into a grid-like format (with each sentence inside its own grid, like in Excel) and then have Revolution color code each alternating sentence to match the first and second language.
So, for example, it would go:
L1 (English) | L2 (French) --> Both in color A
L1 (English) | L2 (French) --> Both in color B
L1 (English) | L2 (French) --> Both in color A
L1 (English) | L2 (French) --> Both in color B
Finally, it would be super duper awesome if I could either have it spit out a file in a different format (such as a PDF or something) OR find a way to do the simultaneous scrolling mentioned earlier (wherein scrolling one field automatically scrolls the other).
Thanks so much for everyone's help!!
Re: Creating Parallel Texts
Posted: Sun Jan 15, 2012 6:14 pm
by sturgis
This sounds like something the datagrid might be perfect for (other than the printing to PDF, that's another story).
As a quick test I set up 2 fields with matching sentences and added a datagrid (form mode, not grid mode). I set up the row template with "sentence1" and "sentence2" and set up the behavior of the grid to work with the new template. I didn't mess with the layout stuff, which will probably be necessary to get the behavior you want; I just commented it out for my quick test.
Then in a button I added code that takes the lines of each field to create an array for the datagrid. Set the dgData of the datagrid group to the resulting array and voilà.
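For anyone who can't grab the example stack, the button script could look something like the sketch below. The field names "English" and "French" and the group name "DataGrid 1" are placeholders, not necessarily what the stack uses. The keys "sentence1" and "sentence2" match the row template names mentioned above.

```livecode
on mouseUp
   put field "English" into tEnglish
   put field "French" into tFrench
   -- Use the longer text so no line is dropped if the counts differ.
   put max(the number of lines in tEnglish, the number of lines in tFrench) into tCount
   repeat with i = 1 to tCount
      -- One numerically keyed row per sentence pair.
      put line i of tEnglish into tData[i]["sentence1"]
      put line i of tFrench into tData[i]["sentence2"]
   end repeat
   set the dgData of group "DataGrid 1" to tData
end mouseUp
```

If the two texts have different line counts, the shorter one simply gets empty cells at the end, which also makes misaligned sentences easy to spot.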
I'm not very good with datagrids, so you'll want to check out lessons.runrev.com and read up there, as well as considering the purchase of the datagrid helper (the slug does good work!). But this may get you to where you want to go. Alternatively, you can do as mentioned elsewhere and just have 2 fields separated by a line; then when one field is scrolled, set the vScroll of the second field to the vScroll of the first. (Maybe turn off the scrollbar for 1 field to make it simpler, and use the 2nd as the scroll control.) If you have 'wrap' turned on in your fields and the line length differs enough between the 2 fields, you might have to figure out how to keep the 2 in sync, but I haven't tried it that way so I'm not sure what hoops you might need to jump through.
For the datagrid method I've set up a VERY simple example stack. You can find it at
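A minimal sketch of the two-field scrolling idea, assuming the field that keeps its scrollbar is called "English" and the other one "French" (both names are made up for the example). Fields report their vertical scroll position through the vScroll property, so that is what gets copied across.

```livecode
-- Goes in the script of field "English", the one with the visible scrollbar.
on scrollbarDrag pNewPosition
   -- Copy this field's vertical scroll (in pixels) to the other field.
   set the vScroll of field "French" to the vScroll of me
end scrollbarDrag
```

This only keeps the pixel offsets equal, so it works best when matching lines have the same height in both fields, for example with wrapping turned off.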
http://dl.dropbox.com/u/11957935/sentences.livecode if you wish to try it.
Re: Creating Parallel Texts
Posted: Sun Jan 15, 2012 7:23 pm
by PoLyGLoT
Your program is very functional and effective. Thanks so much! I realized, however, that I still need to tinker to better align the texts. I did have one question in the meantime: How can I resize the width of the two columns in your program? The text is currently getting cut off on both sides.
Again, thanks so much.

Re: Creating Parallel Texts
Posted: Sun Jan 15, 2012 7:45 pm
by sturgis
You'll want to mess with the behavior script and look for the "LayoutControl" handler. I'm horrible at using datagrids, so I didn't set up the code that adjusts the controls in the template row when the data is displayed.
To get a better handle on datagrids, go to lessons.runrev.com and scroll to the bottom. There is a BUNCH of information on the datagrid in the bottom 3 links. Also, as I mentioned, the "datagrid helper" (you can get it from the runrev store) is a pretty great tool for setting up datagrids.
Sorry I'm not much more help than this, I just have trouble remembering between times how to get things done with datagrids.
Re: Creating Parallel Texts
Posted: Sun Jan 15, 2012 9:00 pm
by PoLyGLoT
Thank you. I figured out most of the datagrid!
I have a basic question: I'm trying to edit the texts so that every time there is a quotation mark, the program inserts a quotation mark and return into the text. Furthermore, if there are quotation marks, I would prefer the program not to execute another line of code I have: " replace ". " with ". " & return in binL1"
That way, it will put a return every time there is either a period in the text OR a quotation mark, but only once if the text has both.
The problem I'm having with the whole quotation mark business is that you can't put "replace """ with """ and return" because it's invalid syntax.
Thanks for any help. You've all been great.
Re: Creating Parallel Texts
Posted: Sun Jan 15, 2012 9:26 pm
by sturgis
you can:
replace quote with quote & return in binL1
Can you give an example of some text with quotes, without, with quotes and a period, and explain where and how you want each part to be broken with a return? Are your quotes in pairs so that you have a start quote and end quote? Is it like dialogue, so that it appears as "so then I went to the store." and you want to break it at the ." ? It would be much easier to see what you're trying to accomplish with sample text.
Re: Creating Parallel Texts
Posted: Mon Jan 16, 2012 8:17 pm
by PoLyGLoT
Greetings again,
I must thank you for your continued help - it is truly awesome. I'd also like to elaborate a bit more on what I'm actually trying to do (along with provide you with the examples of text you requested).
Essentially, if it is at all possible, I'm attempting to create a Revolution program which will automatically create a parallel (i.e. bilingual) text from two separate fields of text. So far, I've discovered that I must edit the two fields of text so that they are as similar as possible to each other before any possible merging is to take place. The best way to do that is probably to functionally separate each sentence so that it rests on its own separate line. The current problem is that quotations (from spoken dialogue) are tripping up my editing process.
Here is an example of the text before any editing:
»Du schwatzt wie ein altes Weib, Piter«, erwiderte der Baron mit eiskalter Stimme.
»Weil ich glücklich bin, mein Baron. Während Sie... eifersüchtig sind.«
»Piter!«
»Aber Baron! Ist es nicht schade, daß Sie diesen Plan nur mit fremder Hilfe ausarbeiten
konnten?«
»Irgendwann werde ich dich erwürgen lassen, Piter.«
»Aber selbstverständlich, Baron. Enfin!«
»Stehst du unter Verite oder Semuta, Piter?«
//
This text is largely OK, although I'd prefer it to look like this:
»Du schwatzt wie ein altes Weib, Piter«
, erwiderte der Baron mit eiskalter Stimme. (<--- Separating the comma to a new line)
»Weil ich glücklich bin, mein Baron. Während Sie... eifersüchtig sind.«
»Piter!«
»Aber Baron! Ist es nicht schade, daß Sie diesen Plan nur mit fremder Hilfe ausarbeiten konnten?« (<--- moving up "konnten" to the previous line)
»Irgendwann werde ich dich erwürgen lassen, Piter.«
»Aber selbstverständlich, Baron. Enfin!«
»Stehst du unter Verite oder Semuta, Piter?«
//
The problem also is that if I'm replacing all "." and "," with return and "." or return and ",", the text becomes very wonky when editing these dialogues. I'd prefer to have the dialogues ONLY be edited in terms of having the full sentence start with » and end with « (NOT breaking up the intermediary periods and commas interleaved within that block of text).
Also, check out another example of a longer dialogue (before editing):
»Wer die Wahrheit ohne Furcht ausspricht, verunsichert den Baron«, sagte Piter. Sein Gesicht
wurde zur Karikatur einer erstarrten Maske. »Oho, Baron! Sie sollten wissen, daß es ein
Mentat stets vorher weiß, wann der Henker zu ihm kommt. Sie werden sich meiner Dienste
bedienen, solange ich Ihnen von Nutzen bin. Mich früher umbringen zu lassen bedeutet
Vergeudung, und ich bin noch immer für viele Dinge gut. Ich weiß, was Sie von diesem
lieblichen Wüstenplaneten gelernt haben: Vergeude nichts! Richtig, Baron?«
Now what I'd like it to look like (after editing):
»Wer die Wahrheit ohne Furcht ausspricht, verunsichert den Baron«
, sagte Piter.
Sein Gesicht wurde zur Karikatur einer erstarrten Maske.
»Oho, Baron! Sie sollten wissen, daß es ein Mentat stets vorher weiß, wann der Henker zu ihm kommt. Sie werden sich meiner Dienste bedienen, solange ich Ihnen von Nutzen bin. Mich früher umbringen zu lassen bedeutet Vergeudung, und ich bin noch immer für viele Dinge gut. Ich weiß, was Sie von diesem lieblichen Wüstenplaneten gelernt haben: Vergeude nichts! Richtig, Baron?« (ALL on one line, wrapping of the text occurring.)
//
Ultimately, I should give you an example of the kind of parallel text I'm looking to create (which raises questions of whether this is possible with an automated program). Unfortunately, I cannot upload a pdf or .doc extension, so I'm unsure how to show you what I mean (or what kind of files I CAN upload, for that matter).
Thanks for all your help!!
Re: Creating Parallel Texts
Posted: Mon Jan 16, 2012 9:58 pm
by sturgis
Looks pretty complex. Sometimes breaking at comma, sometimes not, depending on if stuff is inside a quote set, or outside the set or if the person holds their mouth just so.
Are you going to supply all the pre-formatted and organized text? If so, then a method would probably be to break things up as best you can using script (i.e. break it into chunks on period, and whatever else you can automate to get "close" to your desired goal). After you do what you can with scripts, manual editing would come into play.
Question about your quotes. If your text really contains » to start and « to end quotes, then to break out those sections you could
replace "»" with return & "»" in theText
replace "«" with "«" & return in theText
Do that before changing periods since it appears you don't want to break up quotes in the middle.
Then you can eat the text line by line
(repeat for each line tLine in theText)
check to see if the line starts with » and ends with « and if so, skip that line.
If not, then break at periods, commas etc. It might take multiple careful iterations and some code that's smarter than me to get a perfect auto split (I doubt perfection is possible if it's user-supplied text). But if you can get CLOSE then your manual editing job would be much simpler.
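Pulling those steps together, an untested sketch might look like this. The function name splitOutsideQuotes is made up. Joining the hard-wrapped lines first is my own assumption, based on the sample text where a quote runs over two lines. It also only breaks on ". " outside quotes, so commas and other marks would need their own replace lines.

```livecode
function splitOutsideQuotes pText
   -- Assumption: undo hard line wraps first so a quote never spans two lines.
   replace return with space in pText
   -- Put every »...« block on a line of its own.
   replace "»" with return & "»" in pText
   replace "«" with "«" & return in pText
   put empty into tResult
   repeat for each line tLine in pText
      -- Trim stray spaces left at either end by the replaces.
      put word 1 to -1 of tLine into tOut
      if tOut is empty then next repeat
      if not (char 1 of tOut is "»" and char -1 of tOut is "«") then
         -- Outside a quote: break after each period that ends a sentence.
         replace ". " with "." & return in tOut
      end if
      put tOut & return after tResult
   end repeat
   delete last char of tResult
   return tResult
end splitOutsideQuotes
```

You would call it with something like: put splitOutsideQuotes(field 1) into field 1. Because it joins all lines before splitting, any paragraph breaks in the original are lost, so treat the result as a first pass before manual editing.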
I did something like your 'language' stack once. Kinda. Had 1 field that I could paste text into in spanish, then on mouseclick it would grab the mouseword and in another field, the translation would appear (using google translate) Worked ok I guess but didn't do much for teaching grammatical rules of the language. Sounds like what you're working on could be very helpful. (especially to a person like me who is horrid at language acquisition.)