README: notice that this repo has moved to collective/cdsc_reddit
@@ -2,6 +2,9 @@
title: Utilities for Reddit Data Science
---

> **This repository has moved.** Active development is now at
> **<https://gitea.communitydata.science/collective/cdsc_reddit>**.
> This copy is archived and read-only—please use the new location.
The reddit_cdsc project contains tools for working with Reddit data. The project is designed for the hyak super computing system at The University of Washington. It consists of a set of python and bash scripts and uses the [Pyspark](https://spark.apache.org/docs/latest/api/python/index.html "Pyspark documentation") and [pyarrow](https://arrow.apache.org/docs/python/ "documentation of python arrow bindings") to process large datasets. As of November 1st 2020, the project is under active development by [Nate TeBlunthuis](https://wiki.communitydata.science/People#Nathan_TeBlunthuis_.28University_of_Washington.29 "Nate's profile on the Community Data Science Collective Wiki") and provides scripts for: