Character Count

ARAS
Posts: 55
Joined: Sat Nov 02, 2013 5:35 pm

Character Count

Post by ARAS » Wed Nov 06, 2013 1:00 am

Hello,

I was trying to get the number of characters in a field. I tried doing that in two different ways. However, I got different results.

Does anybody have an idea why the result is different?

Regards,
Aras

Code: Select all

   local tString
   put field "Field" into tString
   set the caseSensitive to true
   repeat for each char tChar in tString
      if toLower(tChar) is tChar then
         ## If we make the character lower case is it the same as the original character?
         add 1 to tLowercaseCount
      end if
      if toUpper(tChar) is tChar then
         ## If we make the character upper case is it the same as the original character?
         add 1 to tUppercaseCount
      end if
   end repeat
   put tLowercaseCount + tUppercaseCount into tCount
   answer "Lower case characters:" & tLowercaseCount && "Upper case characters:" & tUppercaseCount && "Total character count:" & tCount
Message Box Result

Code: Select all

Lower case characters:32615 Upper case characters:19640 Total character count:52255

Code: Select all

   local tStringUni
   put field "Field" into tStringUni
   repeat for each char tCharUni in tStringUni
      add 1 to tCountUni
   end repeat
   answer "tCountUni:" & tCountUni
Message Box Result

Code: Select all

tCountUni:34437

Simon
VIP Livecode Opensource Backer
Posts: 3901
Joined: Sat Mar 24, 2007 2:54 am

Re: Character Count

Post by Simon » Wed Nov 06, 2013 1:30 am

Hi ARAS,
Is this unicode text again?
I saw from a previous post you are working with Arabic? Can you post a single short line that shows this problem?

Simon
I used to be a newbie but then I learned how to spell teh correctly and now I'm a noob!

dunbarx
VIP Livecode Opensource Backer
Posts: 10331
Joined: Wed May 06, 2009 2:28 pm

Re: Character Count

Post by dunbarx » Wed Nov 06, 2013 3:35 am

Aras.

Did you intend to count spaces and other non-alphabetic chars? The functions "toLower" and "toUpper" have no effect on these, and they would be counted in each fork of the conditional. This might be the issue.

Of course, for your second sweep through, you could always: answer the length of field "yourField"

Craig Newman
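
A minimal sketch of the point above, assuming it is run from a button script (the mouseUp handler is only a convenient place to try it). It checks a single space against both tests used in the original loop:

```livecode
on mouseUp
   local tChar
   set the caseSensitive to true
   put space into tChar
   -- toLower and toUpper leave a space unchanged,
   -- so both comparisons are true for the same character
   answer (toLower(tChar) is tChar) && (toUpper(tChar) is tChar)
end mouseUp
```

Any character without a case distinction (space, digit, comma) passes both tests, so two separate if statements add it to both counters.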

ARAS
Posts: 55
Joined: Sat Nov 02, 2013 5:35 pm

Re: Character Count

Post by ARAS » Wed Nov 06, 2013 10:02 am

Simon wrote:Hi ARAS,
Is this unicode text again?
I saw from a previous post you are working with Arabic? Can you post a single short line that shows this problem?

Simon
Hi Simon,

It is in Turkish, which uses Latin letters. According to the meta tag, the charset is windows-1254, which is the code page for Turkish.

This is equivalent to the source code I am using. I used words with Turkish characters.

Code: Select all

<html>

<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1254">
<meta http-equiv="Content-Language" content="tr">
<title>Kayıt</title>
<meta name="keywords" content="Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm">


</head>

<body>
Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm
</body>
In the text field, I get the code as shown below (with some weird characters).

Code: Select all

<html>


<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1254">
<meta http-equiv="Content-Language" content="tr">
<title>Kayıt</title>
<meta name="keywords" content="Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm">


</head>

<body>
Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm
</body>
ARAS
Last edited by ARAS on Wed Nov 06, 2013 10:15 am, edited 3 times in total.

ARAS
Posts: 55
Joined: Sat Nov 02, 2013 5:35 pm

Re: Character Count

Post by ARAS » Wed Nov 06, 2013 10:08 am

dunbarx wrote:Aras.

Did you intend to count spaces and other non-alphabetic chars? The functions "toLower" and "toUpper" have no effect on these, and they would be counted in each fork of the conditional. This might be the issue.

Of course, for your second sweep through, you could always: answer the length of field "yourField"

Craig Newman
Hi Craig,

I wanted to count everything, including spaces. I understand, and you are right. I posted this in the middle of the night. What I actually meant is: why does the toLower and toUpper code return a larger number than counting each char in the field?

For example, for the html below, I get this result.

Code: Select all

<html>


<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1254">
<meta http-equiv="Content-Language" content="tr">
<title>Kayıt</title>
<meta name="keywords" content="Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm">


</head>

<body>
Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm
</body>

Code: Select all

Lower case characters:326 Upper case characters:129 Total character count:455
tCountUni:333
ARAS

ARAS
Posts: 55
Joined: Sat Nov 02, 2013 5:35 pm

Re: Character Count

Post by ARAS » Wed Nov 06, 2013 11:03 am

I think I know the reason why it gives a value larger than the total character count.

It detects some characters as both upper and lower case. I don't know how, but this is how I reached that conclusion.

When I used two if statements, the sum of the lower and upper case counts was greater than the total number of characters.

Code: Select all

      if toLower(tChar) is tChar then
         ## If we make the character lower case is it the same as the original character?
         add 1 to tLowercaseCount
      end if
      if toUpper(tChar) is tChar then
         ## If we make the character upper case is it the same as the original character?
         add 1 to tUppercaseCount
      end if
When I used an else if statement, the sum of the lower and upper case counts was equal to the total number of characters.

Code: Select all

      if toLower(tChar) is tChar then
         ## If we make the character lower case is it the same as the original character?
         add 1 to tLowercaseCount
      else if toUpper(tChar) is tChar then
         ## If we make the character upper case is it the same as the original character?
         add 1 to tUppercaseCount
      end if
Basically, in the second snippet, once it detects a lower case character it skips the else if branch. In the first snippet, however, some characters pass both if tests, which makes the sum larger than the total number of characters in the field.

That raises another question: why is lower + upper equal to the total? It is supposed to be lower than the total because of spaces and maybe some other characters, isn't it?

So do toLower and toUpper work incorrectly sometimes? Any guesses?

ARAS

dunbarx
VIP Livecode Opensource Backer
Posts: 10331
Joined: Wed May 06, 2009 2:28 pm

Re: Character Count

Post by dunbarx » Wed Nov 06, 2013 2:56 pm

Hi.

I think you are saying what I did. If you step through some random text that contains non-alpha chars, you will see that both conditionals are met.

toUpper and toLower are only meant for alpha chars, those where uppercase and lowercase have meaning. A space has no case, nor does a "3" or a comma.

Try adding a "next repeat" to each portion of the handler, so that if a toLower, for example, does indeed match, you do not then test again for toUpper:

Code: Select all

   put field "Field" into tString
   set the caseSensitive to true
   repeat for each char tChar in tString
      if toLower(tChar) is tChar then
         ## If we make the character lower case is it the same as the original character?
         add 1 to tLowercaseCount
         next repeat
      end if
      if toUpper(tChar) is tChar then
         ## If we make the character upper case is it the same as the original character?
         add 1 to tUppercaseCount
         next repeat
      end if
   end repeat
Or perhaps better, use a switch construct, which will explicitly separate the conditional instances.

Craig
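
A sketch of what the switch version could look like, under some assumptions: it runs from a button script, the field is named "Field" as in the original post, and tOtherCount is a hypothetical extra counter that is not in the original code. The first case catches every character with no case distinction, so spaces, digits and punctuation are no longer counted as lower case:

```livecode
on mouseUp
   local tString, tChar
   local tLowercaseCount, tUppercaseCount, tOtherCount
   put 0 into tLowercaseCount
   put 0 into tUppercaseCount
   put 0 into tOtherCount
   put field "Field" into tString
   set the caseSensitive to true
   repeat for each char tChar in tString
      switch
         case toLower(tChar) is toUpper(tChar)
            -- no case distinction: space, digit, punctuation and so on
            add 1 to tOtherCount
            break
         case toLower(tChar) is tChar
            add 1 to tLowercaseCount
            break
         default
            -- has a case distinction and is not lower case
            add 1 to tUppercaseCount
      end switch
   end repeat
   answer "Lower:" & tLowercaseCount && "Upper:" & tUppercaseCount && "Other:" & tOtherCount
end mouseUp
```

Because exactly one branch runs per character, the three counters should always add up to the number of chars in the field. Initialising the counters to 0 also avoids adding an unset variable when one category never occurs.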

ARAS
Posts: 55
Joined: Sat Nov 02, 2013 5:35 pm

Re: Character Count

Post by ARAS » Thu Nov 07, 2013 12:14 am

Hi Craig,

Thanks for your help.

I tried next repeat. It gave me the same result as the else if statement. Is there any way to detect alpha characters? Maybe I can first test whether a character is alpha and only then check for upper or lower case. Even if there is a way to detect alpha characters, I think I will have some issues because of windows-1254.

ARAS

dunbarx
VIP Livecode Opensource Backer
Posts: 10331
Joined: Wed May 06, 2009 2:28 pm

Re: Character Count

Post by dunbarx » Thu Nov 07, 2013 12:18 am

Hi.

Easy to detect alpha. Just see if the charToNum is between 65 and 90, or between 97 and 122. Or if the char is among "abcd..

Craig
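
A minimal sketch of the charToNum test, with one possible alternative. Both function names (isAsciiLetter, hasCase) are hypothetical helpers, not built-ins. The range test covers only A-Z and a-z, so Turkish letters such as ş, ğ or İ fall outside it. The second helper treats a character as a letter when toUpper and toLower give different results. Whether that catches the Turkish letters depends on how the engine maps case for the text's encoding, so treat it as an assumption to test on real data:

```livecode
function isAsciiLetter pChar
   local tCode
   put charToNum(pChar) into tCode
   -- 65-90 is "A" to "Z", 97-122 is "a" to "z"
   return (tCode >= 65 and tCode <= 90) or (tCode >= 97 and tCode <= 122)
end isAsciiLetter

function hasCase pChar
   -- caseSensitive is reset per handler, so it has to be set here
   set the caseSensitive to true
   return toUpper(pChar) is not toLower(pChar)
end hasCase
```

Inside the counting loop, either helper can be called first (for example `if isAsciiLetter(tChar) then`), and the upper or lower case test is then applied only to characters that pass.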

Simon
VIP Livecode Opensource Backer
Posts: 3901
Joined: Sat Mar 24, 2007 2:54 am

Re: Character Count

Post by Simon » Thu Nov 07, 2013 12:29 am

Or isNumber

Simon
I used to be a newbie but then I learned how to spell teh correctly and now I'm a noob!

ARAS
Posts: 55
Joined: Sat Nov 02, 2013 5:35 pm

Re: Character Count

Post by ARAS » Fri Nov 08, 2013 12:00 am

Thanks Simon,

I've learned something new because of you :o isNumber

I have also added numbers to the code. It is not perfect, but it is better. If there is no way to detect punctuation and other symbols, this code seems okay.

Thanks everyone for the help.
Best wishes,
ARAS


Counter(numbers+uppers+lowers)

Code: Select all

local tString
put field "Field" into tString
set the caseSensitive to true
repeat for each char tChar in tString
   if isNumber(tChar) then
      add 1 to tNumberCount
   else if toLower(tChar) is tChar then
      add 1 to tLowercaseCount
   else if toUpper(tChar) is tChar then
      add 1 to tUppercaseCount
   end if
end repeat
put tNumberCount + tLowercaseCount + tUppercaseCount into tCount
answer "Numbers:" & tNumberCount && "Lower case characters:" & tLowercaseCount && "Upper case characters:" & tUppercaseCount && "Total character count:" & tCount
Result

Code: Select all

Numbers:2089 Lower case characters:30526 Upper case characters:1822 Total character count:34437

Counter(all)

Code: Select all

local tString2
put field "Field" into tString2
repeat for each char tChar2 in tString2
   add 1 to tCount2
end repeat
answer "tCount2:" & tCount2
Result

Code: Select all

tCount2:34437
