Character Count
Posted: Wed Nov 06, 2013 1:00 am
by ARAS
Hello,
I was trying to get the number of characters in a field. I tried doing that in two different ways. However, I got different results.
Does anybody have an idea why the result is different?
Regards,
Aras
Code: Select all
local tString
put field "Field" into tString
set the caseSensitive to true
repeat for each char tChar in tString
   if toLower(tChar) is tChar then
      ## If we make the character lower case is it the same as the original character?
      add 1 to tLowercaseCount
   end if
   if toUpper(tChar) is tChar then
      ## If we make the character upper case is it the same as the original character?
      add 1 to tUppercaseCount
   end if
end repeat
put tLowercaseCount + tUppercaseCount into tCount
answer "Lower case characters:" & tLowercaseCount && "Upper case characters:" & tUppercaseCount && "Total character count:" & tCount
Message Box Result
Code: Select all
Lower case characters:32615 Upper case characters:19640 Total character count:52255
Code: Select all
local tStringUni
put field "Field" into tStringUni
repeat for each char tCharUni in tStringUni
   add 1 to tCountUni
end repeat
answer "tCountUni:" & tCountUni
Message Box Result
Re: Character Count
Posted: Wed Nov 06, 2013 1:30 am
by Simon
Hi ARAS,
Is this unicode text again?
I saw from a previous post you are working with Arabic? Can you post a single short line that shows this problem?
Simon
Re: Character Count
Posted: Wed Nov 06, 2013 3:35 am
by dunbarx
Aras.
Did you intend to count spaces and other non-alphabetic chars? The functions "toLower" and "toUpper" have no effect on these, and they would be counted in each fork of the conditional. This might be the issue.
Of course, for your second sweep through, you could always: answer the length of field "yourField"
Craig Newman
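Craig's point can be checked on a short string. The sketch below is a minimal illustration and not part of the original scripts: the mouseUp handler, the tBothCount variable and the sample text "Ab 3," are assumptions chosen for the example. It counts only the characters that pass both tests, which are the ones two separate if statements would count twice.

```livecode
on mouseUp
   local tString, tChar, tBothCount
   -- Hypothetical sample text: one upper case letter, one lower case
   -- letter, a space, a digit and a comma
   put "Ab 3," into tString
   put 0 into tBothCount
   set the caseSensitive to true
   repeat for each char tChar in tString
      -- A character that neither toLower nor toUpper changes has no case,
      -- so it satisfies both comparisons
      if toLower(tChar) is tChar and toUpper(tChar) is tChar then
         add 1 to tBothCount
      end if
   end repeat
   answer "Characters matching both tests:" && tBothCount
end mouseUp
```

With this sample text, "A" fails the toLower test and "b" fails the toUpper test, so only the space, the "3" and the comma are counted, and tBothCount ends at 3.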
Re: Character Count
Posted: Wed Nov 06, 2013 10:02 am
by ARAS
Simon wrote:Hi ARAS,
Is this unicode text again?
I saw from a previous post you are working with Arabic? Can you post a single short line that shows this problem?
Simon
Hi Simon,
It is in Turkish. We use Latin letters. According to the meta tag, the charset is windows-1254, which is the encoding for the Turkish language.
This is equivalent to the source code I am using. I used words with Turkish characters.
Code: Select all
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1254">
<meta http-equiv="Content-Language" content="tr">
<title>Kayıt</title>
<meta name="keywords" content="Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm">
</head>
<body>
Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm
</body>
In the text field, I get the code as shown below (with some weird characters).
Code: Select all
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1254">
<meta http-equiv="Content-Language" content="tr">
<title>Kayıt</title>
<meta name="keywords" content="Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm">
</head>
<body>
Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm
</body>
ARAS
Re: Character Count
Posted: Wed Nov 06, 2013 10:08 am
by ARAS
dunbarx wrote:Aras.
Did you intend to count spaces and other non-alphabetic chars? The functions "toLower" and "toUpper" have no effect on these, and they would be counted in each fork of the conditional. This might be the issue.
Of course, for your second sweep through, you could always: answer the length of field "yourField"
Craig Newman
Hi Craig,
I wanted to count everything, including spaces. I understand, and you are right. I posted this in the middle of the night. What I actually meant is: why do the toLower and toUpper tests together return a larger number than counting each char in the field?
For example, for the html below, I get this result.
Code: Select all
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1254">
<meta http-equiv="Content-Language" content="tr">
<title>Kayıt</title>
<meta name="keywords" content="Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm">
</head>
<body>
Kaşık, Çatal, İstanbul, Ördek, Öğretmen, Üzüm
</body>
Code: Select all
Lower case characters:326 Upper case characters:129 Total character count:455
tCountUni:333
ARAS
Re: Character Count
Posted: Wed Nov 06, 2013 11:03 am
by ARAS
I think I know why it gives a value larger than the total character count.
It counts some characters as both upper and lower case. I don't know how, but this is how I reached that conclusion.
When I used two if statements, the sum of the lower and upper case counts was greater than the total number of characters.
Code: Select all
if toLower(tChar) is tChar then
   ## If we make the character lower case is it the same as the original character?
   add 1 to tLowercaseCount
end if
if toUpper(tChar) is tChar then
   ## If we make the character upper case is it the same as the original character?
   add 1 to tUppercaseCount
end if
When I used an else if statement, the sum of the lower and upper case counts was equal to the total number of characters.
Code: Select all
if toLower(tChar) is tChar then
   ## If we make the character lower case is it the same as the original character?
   add 1 to tLowercaseCount
else if toUpper(tChar) is tChar then
   ## If we make the character upper case is it the same as the original character?
   add 1 to tUppercaseCount
end if
Basically, in the second code, once it detects a lower case character, it skips the else if branch. In the first code, however, some characters pass both if statements, which produces a count larger than the total number of characters in the field.
This raises another question: why is lower + upper equal to the total? It is supposed to be lower than the total because of spaces and maybe some other characters, isn't it?
So do toLower and toUpper work incorrectly sometimes? Any guesses?
ARAS
Re: Character Count
Posted: Wed Nov 06, 2013 2:56 pm
by dunbarx
Hi.
I think you are saying what I did. If you step through some random text that contains other than alpha chars, you will see that both conditionals are met.
toUpper and toLower are only meant for alpha chars, those where uppercase and lowercase have meaning. A space has none, nor does a "3" or a comma.
Try adding a "next repeat" to each portion of the handler, so that if a toLower, for example, does indeed match, you do not then test again for toUpper:
Code: Select all
put field "Field" into tString
set the caseSensitive to true
repeat for each char tChar in tString
   if toLower(tChar) is tChar then
      ## If we make the character lower case is it the same as the original character?
      add 1 to tLowercaseCount
      next repeat
   end if
   if toUpper(tChar) is tChar then
      ## If we make the character upper case is it the same as the original character?
      add 1 to tUppercaseCount
      next repeat
   end if
end repeat
Or perhaps better, use a switch construct, which will explicitly separate the conditional instances.
Craig
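One way the switch construct Craig mentions could look is sketched below. The three-way split, the mouseUp handler and the tOtherCount variable are assumptions added for illustration; they do not come from the posts above. Characters that neither toLower nor toUpper changes (spaces, digits, punctuation) go into their own bucket, so they are neither counted twice nor folded into the lower case count.

```livecode
on mouseUp
   local tString, tChar
   local tLowercaseCount, tUppercaseCount, tOtherCount
   put 0 into tLowercaseCount
   put 0 into tUppercaseCount
   put 0 into tOtherCount
   put field "Field" into tString
   set the caseSensitive to true
   repeat for each char tChar in tString
      switch
         case toLower(tChar) is tChar and toUpper(tChar) is tChar
            -- Unchanged by both functions: the character has no case
            add 1 to tOtherCount
            break
         case toLower(tChar) is tChar
            -- Unchanged by toLower only: already lower case
            add 1 to tLowercaseCount
            break
         default
            -- toLower changed it, so it is treated as upper case
            add 1 to tUppercaseCount
      end switch
   end repeat
   answer "Lower:" && tLowercaseCount && "Upper:" && tUppercaseCount && "Other:" && tOtherCount
end mouseUp
```

In the else if version, every character that toLower leaves unchanged, including spaces and punctuation, lands in the lower case count, which is why lower + upper equals the total there. In the sample reported earlier in the thread, 455 - 333 = 122 characters passed both tests.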
Re: Character Count
Posted: Thu Nov 07, 2013 12:14 am
by ARAS
Hi Craig,
Thanks for your help.
I tried next repeat. It gave me the same result as the else if statement. Is there any way to detect alpha characters? Maybe I can first test whether a character is alpha and only then check whether it is upper or lower case. Even if there is a way to detect alpha characters, I think I will have some issues because of windows-1254.
ARAS
Re: Character Count
Posted: Thu Nov 07, 2013 12:18 am
by dunbarx
Hi.
Easy to detect alpha. Just see if the charToNum is between 65 and 90, or between 97 and 122. Or if the char is among "abcd..
Craig
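The charToNum test Craig describes could be wrapped in a small function, sketched here under the assumption that only unaccented ASCII letters need to be detected. The function name isAsciiLetter is hypothetical. Turkish-specific letters such as ş or İ fall outside these two ranges, so this check alone would not classify them as letters.

```livecode
function isAsciiLetter pChar
   local tCode
   put charToNum(pChar) into tCode
   -- 65 to 90 covers "A" to "Z", 97 to 122 covers "a" to "z"
   return (tCode >= 65 and tCode <= 90) or (tCode >= 97 and tCode <= 122)
end isAsciiLetter
```

For example, isAsciiLetter("q") returns true and isAsciiLetter("3") returns false.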
Re: Character Count
Posted: Thu Nov 07, 2013 12:29 am
by Simon
Or isNumber
Simon
Re: Character Count
Posted: Fri Nov 08, 2013 12:00 am
by ARAS
Thanks Simon,
I've learned something new because of you: isNumber.
I have also added numbers to the code. It is not perfect, but it is better. If there is no way to detect punctuation and other symbols, this code seems okay.
Thanks everyone for the help.
Best wishes,
ARAS
Counter(numbers+uppers+lowers)
Code: Select all
local tString
put field "Field" into tString
set the caseSensitive to true
repeat for each char tChar in tString
   if isNumber(tChar) then
      add 1 to tNumberCount
   else if toLower(tChar) is tChar then
      add 1 to tLowercaseCount
   else if toUpper(tChar) is tChar then
      add 1 to tUppercaseCount
   end if
end repeat
put tNumberCount + tLowercaseCount + tUppercaseCount into tCount
answer "Numbers:" & tNumberCount && "Lower case characters:" & tLowercaseCount && "Upper case characters:" & tUppercaseCount && "Total character count:" & tCount
Result
Code: Select all
Numbers:2089 Lower case characters:30526 Upper case characters:1822 Total character count:34437
Counter(all)
Code: Select all
local tString2
put field "Field" into tString2
repeat for each char tChar2 in tString2
   add 1 to tCount2
end repeat
answer "tCount2:" & tCount2
Result