From 3f1ecb200cdf7cf0eff1272b110c622b4667d803 Mon Sep 17 00:00:00 2001
From: Peter Bull
Date: Sat, 23 Apr 2016 15:59:58 -0400
Subject: [PATCH] README tweaks

---
 README.md | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 48443ec..28e8cc8 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,9 @@
-cookiecutter-data-science
+Cookiecutter Data Science
 -------------------------
 
-An opinionated, but not-afraid-to-be-wrong project template for data science projects. Pull requests welcome. Debate encouraged.
+An opinionated, but not-afraid-to-be-wrong project template for data science projects.
 
+### [Project homepage](https://github.com/drivendata/cookiecutter-data-science)
 
 Requirements to create project:
 -----------
@@ -17,9 +18,3 @@ To start a new project:
 
 [![asciicast](https://asciinema.org/a/9bgl5qh17wlop4xyxu9n9wr02.png)](https://asciinema.org/a/9bgl5qh17wlop4xyxu9n9wr02)
 
-
-Data
-----------
-** By default, the `data` folder is included in the `.gitignore` file.** If you have a small amount of data that rarely changes, you may want to include the data in the repository. Github currently warns if files are over 50MB and rejects files over 100MB. Some other options for storing large data include [AWS S3](https://aws.amazon.com/s3/) with a syncing tool (e.g., [`s3cmd`](http://s3tools.org/s3cmd)), [Git Large File Storage](https://git-lfs.github.com/), [Git Annex](https://git-annex.branchable.com/), and [dat](http://dat-data.com/).
-
-The prefered workflow if data is not in the repository is to have a make command `make data` that will download or create the relevant datasets.