Practicum: Tokenization, Lemmatization, Frequency and Correlation Analysis In-Class Presentations: Student 5 and Student 6 Description: This session will focus on the basic text processing techniques that underlie nearly all modes of textual analysis. Topics covered will include stemming, lemmatization, semantic reduction, naïve Bayesian classification, and word frequency analysis.
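As a rough illustration of the kind of processing this practicum covers, the sketch below tokenizes a short string and counts word frequencies using only the Python standard library. The function names (tokenize, word_frequencies), the regular expression, and the sample sentence are assumptions made for illustration; they are not the course's actual materials or tooling.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


# Hypothetical sample text, chosen only to make the counts easy to check by hand.
sample = "The cat sat on the mat. The mat was flat."
freqs = word_frequencies(sample)
print(freqs.most_common(2))  # → [('the', 3), ('mat', 2)]
```

This simple tokenizer ignores digits and non-ASCII letters and does no stemming or lemmatization, so "mat" and "mats" would be counted as separate words.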
Practicum: Lexical Correlation and Lexical Variety In-Class Presentations: Student 7 and Student 8 Description: This unit focuses on basic modes of machine textual “reading” by analyzing the words on the page and their relationships to each other. We will learn to perform various modes of machine reading and also
Practicum: Clustering and Topic Modeling In-Class Presentations: Student 9 and Student 10 Description: This unit focuses on modes of modeling textual content. We will learn how to build several types of models and discuss the math behind them. We will also investigate their use and misuse, and the impact of
Practicum: Visualization In-Class Presentations: Student 11 and Student 12 Description: This unit will focus on data visualization. Specific topics of discussion will include the advantages and limitations of visualization as a means of communication and the methods for fitting the correct visualization to the correct dataset. The practicum will
During finals week we will meet in an informal setting (time and location TBD based on everyone’s schedule). At this final meeting we will discuss each other’s prospectuses for future digital projects and any other topics or questions of interest to the group.