Removing empty lines after removing HTML tags

Tribblehunter
Posts: 78
Joined: Wed Apr 10, 2013 9:08 pm

Removing empty lines after removing HTML tags

Post by Tribblehunter » Wed Oct 15, 2014 10:16 pm

Hi all.

Have successfully created my first couple of programs, which have proven useful (and not just something for learning!).

Now I need to finesse them and am stuck on an issue with parsing data from a website.

I have the source from the website in a field. I have removed the HTML tags (which gives me the text and other things).

I am trying to remove all the blank lines so it is easier to work with the text.

Filter field with empty does not seem to do it.

I am guessing there are some 'hidden characters' which are left from the removal of the HTML.

Any pointers? I have searched the internet but cannot seem to find out exactly what is happening.
Returning to try to learn livecode again.

But much greyer at the temples than the last time.

Simon
VIP Livecode Opensource Backer
Posts: 3901
Joined: Sat Mar 24, 2007 2:54 am

Re: Removing empty lines after removing HTML tags

Post by Simon » Wed Oct 15, 2014 10:25 pm

Hi Tribblehunter,
You should not be manipulating text in a field, as this is very slow (processor intensive). Put the field's contents into a variable, do all your work on that, then put the result back into the field.

See if filter without empty works for you then.

Simon
I used to be a newbie but then I learned how to spell teh correctly and now I'm a noob!
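
A minimal sketch of what Simon describes, assuming the script sits in a button and the field is called "Source" (a made-up name for illustration):

```livecode
on mouseUp
   local tText
   -- Copy the field's contents into a variable and work on the copy
   put field "Source" into tText
   -- Remove every line that contains no characters at all
   filter tText without empty
   -- Write the result back to the field in one step
   put tText into field "Source"
end mouseUp
```

Lines that hold a space, a tab or some other invisible character are not empty, so this alone leaves them in place.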

jiml
Posts: 339
Joined: Sat Dec 09, 2006 1:27 am

Re: Removing empty lines after removing HTML tags

Post by jiml » Thu Oct 16, 2014 11:46 pm

Give this a shot:

Code:

-- "?*" matches any line with at least one character; "*" alone matches every line
filter someText with "?*"

Tribblehunter
Posts: 78
Joined: Wed Apr 10, 2013 9:08 pm

Re: Removing empty lines after removing HTML tags

Post by Tribblehunter » Fri Oct 17, 2014 12:20 am

Thanks guys.

I will try the suggestions out.

Tribblehunter
Posts: 78
Joined: Wed Apr 10, 2013 9:08 pm

Re: Removing empty lines after removing HTML tags

Post by Tribblehunter » Tue Oct 21, 2014 10:35 pm

Neither worked.

Managed to use regex "\s" to remove all spaces, but this left all text on one line.
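
For what it's worth, in the PCRE syntax used by replaceText, "\s" matches line breaks as well as spaces and tabs, which would explain why everything ended up on one line. A sketch that removes only the lines made up of nothing but whitespace, assuming the text is already in a variable (the function name stripBlankLines is made up for illustration):

```livecode
function stripBlankLines pText
   -- A line break, then any run of whitespace (which may include
   -- further line breaks), then another line break, collapses to
   -- a single line break. Spaces between words are left alone.
   return replaceText(pText, "\n\s*\n", cr)
end stripBlankLines
```

It could be called as: put stripBlankLines(field "Source") into field "Source", where "Source" is again just a placeholder field name. A blank line at the very start of the text is not covered by this pattern.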

Simon
VIP Livecode Opensource Backer
Posts: 3901
Joined: Sat Mar 24, 2007 2:54 am

Re: Removing empty lines after removing HTML tags

Post by Simon » Tue Oct 21, 2014 10:44 pm

Hi Tribblehunter,
Since it doesn't see the empty lines as empty, they could have something like tabs in them or one of the unprintable characters.
You should do a charToNum on the blank line to see what is there.

Simon
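
A rough sketch of that check, assuming the field is called "Source" and that line 2 is one of the lines that looks blank (both are placeholders to adjust):

```livecode
on mouseUp
   local tText, tLine, tChar, tCodes
   put field "Source" into tText
   -- Assumption: line 2 is one of the lines that appears empty
   put line 2 of tText into tLine
   -- Collect the numeric code of every character on that line
   repeat for each char tChar in tLine
      put charToNum(tChar) & space after tCodes
   end repeat
   answer "Character codes:" && tCodes
end mouseUp
```

A 9 would be a tab and a 32 a space. A 13 would be a carriage return, which can be left at the end of each line when the page source uses CRLF line endings.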

[-hh]
VIP Livecode Opensource Backer
Posts: 2262
Joined: Thu Feb 28, 2013 11:52 pm

Re: Removing empty lines after removing HTML tags

Post by [-hh] » Tue Oct 21, 2014 11:16 pm

Hi all,

Perhaps our problem is that we don't know whether you are working
- with the text of a field, or
- with the htmlText of a field.

For example, setting the htmlText of a field to the contents of an HTML file gives you the text of the file, and you don't have to remove the HTML tags yourself if LC's rendering already meets your needs.
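
A small sketch of that idea, assuming a field called "OUT" exists on the card and the HTML lives in a local file picked by the user (the variable names are illustrative only):

```livecode
on mouseUp
   local tPath, tHTML, tPlain
   -- Ask for the HTML file; "it" is empty if the dialog is cancelled
   answer file "Choose an HTML file"
   if it is empty then exit mouseUp
   put it into tPath
   put url ("file:" & tPath) into tHTML
   -- Let the field interpret the tags...
   set the htmlText of field "OUT" to tHTML
   -- ...then read back the plain text it displays
   put the text of field "OUT" into tPlain
end mouseUp
```

The htmlText property only understands a subset of HTML, so how well this works depends on the page.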

Then you could do, as Simon suggested:

Code:

-- A bit cumbersome; this could be one regex line (forbidden in
-- the beginners subforum). Here you can easily control the details:
put the text of fld "IN" into S
replace tab with space in S
repeat while space & space is in S
  replace space & space with space in S
end repeat
repeat while cr & space is in S
  replace cr & space with cr in S
end repeat
repeat while space & cr is in S
  replace space & cr with cr in S
end repeat
repeat while cr & cr is in S
  replace cr & cr with cr in S
end repeat
put S into fld "OUT"
This should remove nearly all unwanted whitespace from your field.
shiftLock happens
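
For readers curious about the regex route mentioned in the code comment, here is a hedged sketch that should behave much like the loops above. The function name collapseWhitespace is invented for illustration. One difference is an assumption worth checking against your data: "\s" also matches carriage returns and other whitespace that the loops do not touch.

```livecode
function collapseWhitespace pText
   -- Tabs and runs of spaces become a single space
   put replaceText(pText, "[ \t]+", space) into pText
   -- Any whitespace around one or more line breaks becomes
   -- a single line break, which also drops the blank lines
   put replaceText(pText, "\s*\n\s*", cr) into pText
   return pText
end collapseWhitespace
```

Usage would be along the lines of: put collapseWhitespace(the text of fld "IN") into fld "OUT".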

Tribblehunter
Posts: 78
Joined: Wed Apr 10, 2013 9:08 pm

Re: Removing empty lines after removing HTML tags

Post by Tribblehunter » Wed Oct 22, 2014 6:51 am

I understand more now! Thank you very much.

I will experiment this evening. Off to do normal job now!

Tribblehunter
Posts: 78
Joined: Wed Apr 10, 2013 9:08 pm

Re: Removing empty lines after removing HTML tags

Post by Tribblehunter » Thu Oct 23, 2014 10:23 pm

Thanks for all the help.

With a bit of playing around it worked. Ish. But good enough for the testing phase. Little program for organising workshop manuals and checking internet for manuals via serial number entry.

My first proper program that is actually being used!! lol
