The demand for analytics skills is growing rapidly across all domains. Text and data analysis is one of those skills, yet it remains difficult to learn. Researchers and students are often teased by black-box, point-and-click tools that produce a few quick visualizations and whet the appetite; however, the next step in learning text analytics is a steep one that requires students to learn statistics and programming.
Constellate is brought to you by the ITHAKA services JSTOR and Portico. Constellate's primary goal is to make it easier for anyone to learn these text analytics skills by creating a learning platform that empowers faculty, librarians, and other instructors to educate a generation in text and data analysis. We provide users with the ability to build datasets for analysis from a variety of sources and provide a gathering space for the growing community of practitioners.
Our solution is centered on student and researcher success, providing text and data analysis capabilities and access to content from some of the world’s most respected databases in an open environment with a variety of teaching materials that can be used, modified, and shared.
Summary of the Platform
Constellate provides value to users in three core areas: teaching and learning text analytics, building datasets from multiple content sources, and visualizing and analyzing datasets.
Learn & Teach
- Template and Tutorial Code: Work with template Jupyter Notebooks to analyze your dataset and learn about text analytics (with additional environments forthcoming, such as RStudio).
- Lessons and Documentation: Lessons and educational materials created by a community of experts, including those from the NEH-funded Text Analysis Pedagogy Institutes.
- Collaborative Teaching Materials Creation: Users may create, edit, reuse, and collaborate on tutorials, code, documentation, and other educational resources for text analysis (our tutorial notebooks are all available on GitHub, in addition to being accessible for use in our Analytics Lab).
Build Datasets
- Multiple Collections: Anchor collections from JSTOR and Portico, with additional content sources continually added (such as the Library of Congress’s Chronicling America). Further details about the collections are available.
- Data Download in JSON
  - All content: bibliographic metadata, unigrams, bigrams, trigrams
  - Open content: bibliographic metadata, unigrams, bigrams, trigrams, full-text
- Dataset Dashboard: Easily view datasets you have built or accessed.
Visualize & Analyze
- Analytics Lab: Integrated computational environment powered by BinderHub that lets users analyze text content using the provided template Jupyter Notebooks and tutorials.
- Visualize: Built-in visualizations for your datasets
- Work with Rights-Restricted Full-Text: Access to substantial compute cycles to work with the full-text of rights-restricted content (forthcoming in late 2021 -- until then, it is possible to request JSTOR content through a personal agreement).
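For readers new to the terms used in the dataset downloads above, unigrams, bigrams, and trigrams are counts of one-, two-, and three-word sequences in a text. The following is a minimal Python sketch of how such counts can be computed and written out as JSON. The tokenizer, the function names, and the JSON keys are illustrative assumptions, not Constellate's actual processing pipeline or dataset schema.

```python
import json
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens, joined by spaces."""
    grams = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(grams)


def summarize(text):
    """Build a JSON-serializable record of n-gram counts.

    The keys below are hypothetical and chosen for illustration only.
    """
    tokens = tokenize(text)
    return {
        "unigrams": dict(ngram_counts(tokens, 1)),
        "bigrams": dict(ngram_counts(tokens, 2)),
        "trigrams": dict(ngram_counts(tokens, 3)),
    }


if __name__ == "__main__":
    record = summarize("the cat sat on the mat")
    print(json.dumps(record, indent=2))
```

Because n-gram counts discard word order beyond the n-gram window, they support many analyses, such as word frequency and keyword trends, without requiring the full text.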
Roll Out and Beta Evaluation
We are rolling out the subscription service by offering a six-month beta evaluation (January to June 2021, with the possibility of extensions) to institutions that participate in JSTOR or Portico. It is important to us that the platform be as widely available as possible while also covering our costs. To that end, there will always be a free tier of service available to individuals that improves on JSTOR’s self-service Data for Research (DfR) functionality (see our documentation about the differences between this new platform and DfR).
Institutional participants in the beta evaluation period will be able to provide their users with additional computational power in the Analytics Lab and participate in training sessions:
| Feature | Non-Trial Users | Beta Participants |
|---|---|---|
| Build & visualize datasets up to a specified number of items | 25K | 50K |
| Download datasets up to a specified number of items | 25K | 50K |
| View and download built-in visualizations for datasets | ✔ | ✔ |
| Access to computational environment resources sufficient for: | Learning | Teaching & research |
| Computational environment with learn to text mine notebooks | ✔ | ✔ |
| Compute environment - CPUs | <Core Tier | 4 cores |
| Compute environment - maximum memory | 2 GB | 8 GB |
| Unlimited simultaneous users in computational environment | ✖ | ✔ |
| Adopt, adapt, and contribute tutorials and documentation | ✔ | ✔ |
| Run institutional users’ (instructors, students, etc.) repositories of code in our computational environment | ✖ | ✔ |
| Attend our Train-the-Trainer workshops | ✖ | ✔ |
| Attend our four-week Learn Text Analytics Course | ✖ | ✔ |
This free beta evaluation period is all about learning. It will help institutions gauge the demand for this tool on their campus and the effort required to implement it. It will help us assess how much usage the platform may see so that we can estimate costs more accurately and determine appropriate fees.
For our partner institutions in the beta evaluation, Constellate will run orientation sessions and text analytics workshops, be available for assistance and discussion, and reach out to you frequently. We ask that our partner libraries run workshops or otherwise introduce the platform to users, participate in the teaching and research community, and have frequent discussions with us.
Join the Beta
If you are interested in participating in our beta program, fill out this form to let us know and we will be in touch to set up a video chat.
We intend to always offer a free tier of service suitable for individuals who need to build datasets of up to 25,000 items and do some work in our Analytics Lab. In addition to this free tier, in the second half of 2021 we expect to offer institutions subscriptions to a paid tier of service sized for teaching and learning. We do not yet have pricing for these subscriptions. We want to balance covering our costs with keeping these subscriptions reasonably priced. The beta evaluation period associated with our 2021 launch will help us and our institutions evaluate both cost and value. By the end of 2021, we plan to offer an additional tier of service aimed at meeting the more substantial demands of advanced researchers who require computing power and access to the full text of rights-restricted content. If you are an advanced researcher interested in exploring with us what might meet your needs, please let us know at firstname.lastname@example.org.