Word Counts

Discussion related to "Everything" 1.5 Alpha.
ChrisGreaves
Posts: 616
Joined: Wed Jan 05, 2022 9:29 pm

Word Counts

Post by ChrisGreaves »

This is not a problem in need of a solution.
I had cause to deliver a report on five articles I've been commissioned to write.
Everything  Word2003  VBA    Title
554         667       682    Diesel-Electrics
499         497       497    Riding on the Footplate from Southern Cross to Ghooli and Back
330         539       559    The Longest Wheat-Bin in the Southern Hemisphere
2,851       2,765     2,802  Three Times a Year to and from School in Perth on the Kalgoorlie Express
4,234       4,468     4,540  Totals (average: 4,387)
0.965       1.018     1.034  Ratio of each total to the average
Four titles appear to the right.
I measured "word-count" in three ways:
(1) By including the Word-Count in column labels in Everything: Right-click the column-heading bar, Document, Content, Word-count.
(2) By inspecting within Word2003 (File, Properties, Statistics).
(3) By a cute bit of VBA code that calculates readability statistics on a document.

The sums of word counts average out to 4,387 words and vary between 0.965 of the average and 1.034 of the average.

The VBA methods are always worth investigating, as are methods for calculating sentence counts. If a sentence is defined as "terminated by a period", then "Dr.", "Mr.", "Mrs." and "etc." all inflate the sentence count. Likewise for words: are words delimited by spaces (usually yes), by hyphens (maybe not), and so on? The programmer makes a decision and lives with it.
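To make the point about definitions concrete, here is a minimal Python sketch. It is not the VBA code, nor the method used by Word or Everything; the function names, the sample sentence and the abbreviation list are all made up for illustration. It only shows how the same text yields different counts depending on the rule the programmer picks.

```python
import re

def count_words(text: str, split_hyphens: bool = False) -> int:
    """Count words delimited by whitespace; optionally treat hyphens as delimiters too."""
    pattern = r"[\s\-]+" if split_hyphens else r"\s+"
    return len([w for w in re.split(pattern, text) if w])

def count_sentences(text: str, abbreviations: frozenset = frozenset()) -> int:
    """Count tokens that end in a period, skipping any listed abbreviations."""
    return sum(
        1
        for token in text.split()
        if token.endswith(".") and token not in abbreviations
    )

# Hypothetical sample text and abbreviation list, for illustration only.
SAMPLE = "Dr. Smith rode the diesel-electric to Perth. Mrs. Jones waved."
ABBREVIATIONS = frozenset({"Dr.", "Mr.", "Mrs.", "etc."})

print(count_words(SAMPLE))                      # 10: "diesel-electric" is one word
print(count_words(SAMPLE, split_hyphens=True))  # 11: the hyphen splits it in two
print(count_sentences(SAMPLE))                  # 4: "Dr." and "Mrs." count as sentence ends
print(count_sentences(SAMPLE, ABBREVIATIONS))   # 2: abbreviations are ignored
```

Neither answer in each pair is wrong; they simply follow different rules, which is enough to explain a spread of a few percent across tools.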

I thought to post this in case anyone starts questioning Everything's means of calculating Word-Count, or worse, stakes the company's future on Word-Counts (grin!).
Cheers, Chris
horst.epp
Posts: 1384
Joined: Fri Apr 04, 2014 3:24 pm

Re: Word Counts

Post by horst.epp »

So the few people who are interested in word counts
should select whichever method gives the larger number :)
void
Developer
Posts: 15811
Joined: Fri Oct 16, 2009 11:31 pm

Re: Word Counts

Post by void »

Thank you for the issue report ChrisGreaves,

It's a bug with Office.

The word count is also incorrect under Right click -> Properties -> Details -> Word Count
(Everything is using the same word count value)

Please use the word count from Everything as a guide only.
ChrisGreaves
Posts: 616
Joined: Wed Jan 05, 2022 9:29 pm

Re: Word Counts

Post by ChrisGreaves »

void wrote: Thu Feb 29, 2024 4:49 am
Thank you for the issue report ChrisGreaves,
It's a bug with Office.
The word count is also incorrect under Right click -> Properties -> Details -> Word Count
(Everything is using the same word count value)
Please use the word count from Everything as a guide only.

Thank you David.
[pedant]Strictly speaking it's a bug in Office2003, and I, for one, have no plans to move on from Office2003 until they have fixed all the bugs :lol: [/pedant]

Regardless of who is wrong and who is right, my point is that different programming code can and will deliver slightly different word-counts (and other counts).
I think that no one is "right" and no one is "wrong". I would trust any device that returns a result that is consistent with other results.
Cheers, Chris
void
Developer
Posts: 15811
Joined: Fri Oct 16, 2009 11:31 pm

Re: Word Counts

Post by void »

The bug is present in Office 2013 and 2016 too.
Unsure if it is still present in Office 2019..

I will add some functionality to Everything to get the correct word count..
It will look something like: add-column:a a:=WORDCOUNT($content:)



You can find the correct word count in Microsoft Word from File -> Properties -> Statistics
Why the Office property handler doesn't use this value is beyond me..
void
Developer
Posts: 15811
Joined: Fri Oct 16, 2009 11:31 pm

Re: Word Counts

Post by void »

Everything 1.5.0.1370a adds a WORDCOUNT() formula function.

The following search will now work as expected:

ext:doc;docx add-column:a a-label:="Word Count" a:=WORDCOUNT($content:)

This word count column will show the number of words from the content.
It may differ slightly from what Word reports under File -> Properties -> Statistics (typically it is the same, but it may ignore comments and other hidden text).
It should be more accurate than the stock word count property.



WORDCOUNT(text) will return the number of words in the specified text.

A word is one or more alpha-numeric characters or punctuation.
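As a rough illustration of that definition, the Python sketch below counts maximal runs of alphanumeric or punctuation characters. This is only one reading of the sentence above and an assumption about the behaviour, not Everything's actual implementation; the `wordcount` name and the use of ASCII punctuation from `string.punctuation` are choices made for the example.

```python
import re
import string

# Assumption: a "word" is a maximal run of alphanumeric characters and/or
# ASCII punctuation, so whitespace is the only separator.
_WORD = re.compile(r"(?:[^\W_]|[" + re.escape(string.punctuation) + r"])+")

def wordcount(text: str) -> int:
    """Return the number of words in text under the assumption stated above."""
    return len(_WORD.findall(text))

print(wordcount("Three Times a Year, to and from School."))  # 8
print(wordcount("diesel-electric"))                          # 1: the hyphen is punctuation
print(wordcount(""))                                         # 0
```

Under this reading a hyphenated term counts as a single word, which is one of the places where a different tool could reasonably report a different number.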



Please note: $content: will load the entire file text content.
This may take a long time.