Page Last Updated: Monday, 15 January 2024 16:08 EDT


Literary Analysis

Under Construction


The characters in the Persistence of Vision series invent a Hilbert space of literary characteristics, so it seemed appropriate to apply that concept to my novels (and my scientific books).  Each dimension in this literary Hilbert space spans the values of a particular descriptive variable, such as Flesch-Kincaid Grade Level or action scenes per 5000 words of text.  The concept is that similar works should have similar values in each dimension - or directly explainable differences, such as the number of words in a short story versus the number in a novel.
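The idea of "similar works have similar values" can be sketched in a few lines of Python.  This is only an illustration, not the JMP analysis itself: the work names, the two dimensions, and all of the numbers below are invented, and plain Euclidean distance is an assumption (dimensions on very different scales would normally be standardized first).

```python
import math

# Hypothetical works and values, invented purely for illustration.
# Each work is a point whose coordinates are its descriptive variables.
works = {
    "novel_a": {"grade_level": 6.5, "action_per_5000": 2.0},
    "novel_b": {"grade_level": 7.0, "action_per_5000": 2.5},
    "science_book": {"grade_level": 14.0, "action_per_5000": 0.0},
}


def distance(a, b):
    """Euclidean distance over the dimensions the two works share."""
    shared = sorted(set(a) & set(b))
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared))


if __name__ == "__main__":
    names = sorted(works)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = distance(works[first], works[second])
            print(f"{first} vs {second}: {d:.2f}")
```

With these made-up numbers, the two novels sit close together and the scientific book sits far from both, which is the kind of clumping the concept is meant to produce.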

The entire Hilbert space can be contained in a SAS JMP file, but visualizing dozens of dimensions is difficult.  JMP does provide a 3-D tool that allows the visualization to be rotated.  I've provided three salient views, rotated so that their 2-D renderings convey some information.


The size metrics include page count, word count, number of chapters, and number of figures.

Overall, the novels average 350 pages, with the Dark Energy series being slightly longer, averaging 360 pages, and the Sense of Gravity series being shorter at 330 pages.  The word count average is 112,000 and is fairly consistent across the series, with a standard deviation of 12,000 words.  The number of chapters ranges from 18 to 26, with an average of 22.4.  The number of figures has the largest variation, ranging from 1 to 20 per novel; however, most have 6-8 figures.


The content variables are science, religion, sexy scenes, action, travel, business, and figures.  The occurrences and intensity of each of these are divided by the word count for each novel and converted to per-5000-word metrics.
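The per-5000-word conversion is simple arithmetic, sketched here in Python.  The function name and the sample count are hypothetical, and the sketch covers raw occurrence counts only; how intensity is weighted is not described here, so it is left out.

```python
def per_5000_words(count, word_count):
    """Scale a raw count to occurrences per 5000 words of text."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return count * 5000 / word_count


if __name__ == "__main__":
    # Hypothetical example: 56 action scenes in a 112,000-word novel.
    print(per_5000_words(56, 112_000))
```

Normalizing this way lets a short story and a novel be compared on the same axis, since the raw counts alone would mostly reflect length.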

The first scatterplot illustrates the basic concept with four groups of books.  To the lower right are my scientific books.  As expected, they show some variation along the science-per-5000-words axis, but are all close to zero on the sexy-scene and religion axes.  The second group, at the top of the figure, consists of the religious books.  They are high on the religion axis and low on the other two axes.  I invented a third group to span the space; it is very high on the sexy-scene axis, but low on the other two axes.  In the middle are the science fiction novels.

full diagram

The second scatterplot is the same as the first, but with the axes restricted to the values exhibited by the novels.  It shows that the novels have some spread in this Hilbert space but that there are some commonalities within each of the four series that are shown.


The data are color-coded by series.  The Dark Energy series shows more spread in both the sexy scenes and the religion dimensions; however, the average values for each series are close in all three dimensions.

The third scatterplot compares the values in the action, travel, and business dimensions.  There are major differences in all three dimensions; however, again, the averages for each series are close together.

Content 2

The final content dimension, figures per 5000 words, is not displayed.  The Dark Energy and Sense of Gravity series contain books with significantly more figures than their series averages and more than in the Persistence of Vision series.


The style variables are dialog per 5000 words, reading grade level, and passive sentence fraction.  As above, the data are coded by series in the scatterplot.  The Dark Energy novels are written at a slightly higher grade level, 7.5, than the other three series, at 6.7, 6.4, and 6.3.  The Persistence of Vision series, on average, has more dialog and the Dark Energy series has less, while the reverse is true with regard to passive sentences.  However, overall, the averages in all three variables are close.  Individually, the novels show that larger dialog values are correlated with lower grade levels and a smaller proportion of passive sentences, whereas lower dialog values go with higher grade levels and more passive sentences.
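One way to check a relationship like the dialog and grade-level one just described is a Pearson correlation coefficient.  The sketch below is an assumption about how such a check might be done, not a record of what was done in JMP, and the four data points are invented to show a perfectly negative relationship.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical values, invented for illustration only.
    dialog_per_5000 = [10, 20, 30, 40]
    grade_level = [7.5, 7.0, 6.5, 6.0]
    print(f"r = {pearson(dialog_per_5000, grade_level):.2f}")
```

A value near -1 means more dialog goes with a lower grade level; a value near 0 would mean no linear relationship.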



In some dimensions, there are differences among the series; however, they are generally similar.  The 3-D scatterplots have a reference in the legend to the Springer-published scientific books; however, these books aren't shown in most of the views, as they are widely separated from the science fiction data points.  Thus, the Hilbert space concept can produce the desired differentiation among literary works, clumping scientific works in one volume of the space and science fiction in another.  Further differentiation within these novels would be possible.  For example, distinguishing the different sciences that are used would separate the series and provide some intra-series separation.  (Anthropology is strongly used in the latter three novels of the Dark Energy series, but not in the first three - or in the next two series, for that matter.  But anthropology is used in the fourth series.)  Similarly, identifying the science-fictional theme would allow differentiation and, if combined with analyses of other SF works, would allow for clumping along that set of dimensions.  I also experimented with using the date of writing of each novel to see if there were any significant changes as I matured as a writer.  I didn't find anything dramatic, so none of that is displayed here.

This analysis serves no purpose other than being fun.  However, there are people who do literary scholarship in SF and they might find the technique to be useful.


Return to Dean Hartley Science Fiction

Return to Dr. Dean S. Hartley III Entrance.