Enron Email Dataset
- Last Update: December 24, 2013; Created: December 24, 2013
Public
Profile
Title of the dataset | Enron Email Dataset |
---|---|
Provenance of the dataset | http://www.cs.cmu.edu/~enron/ |
How were the data collected/created? What was the cost? | The dataset was collected and prepared by the CALO Project. It was later purchased by Leslie Kaelbling at MIT and turned out to have a number of integrity problems, which several people at SRI, notably Melinda Gervasio, worked hard to correct. |
Data sharing policy | Other |
About data analysis and simulation
Type of data | text |
---|---|
Variable labels of dataset (the names of the variables) | CONTENT (EMAIL), FROM (EMAIL), TO (EMAIL) |
Outline of data | This dataset was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into folders. The corpus contains about 500,000 messages in total. The data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation of Enron. |
Simulation process | |
Expected outcome of the process (obtained knowledge, analysis results, output of tools) | |
Anticipation for analyses/simulations other than the typical ones provided above |
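
The three variable labels listed above (CONTENT, FROM, TO) correspond to the body and two headers of an email message. As a minimal sketch, assuming each message in the corpus is stored as plain text with standard `From:` and `To:` headers, the fields could be extracted with Python's standard `email` module. The function name `extract_fields` and the sample message are hypothetical, made up for illustration and not taken from the dataset.

```python
from email import message_from_string
from email.utils import getaddresses


def extract_fields(raw_message: str) -> dict:
    """Split one raw email into the FROM, TO, and CONTENT variables.

    Assumes a plain-text, non-multipart message with standard headers.
    """
    msg = message_from_string(raw_message)

    # A message may carry several To headers, or none at all.
    to_headers = msg.get_all("To", [])
    # getaddresses handles comma-separated and folded header values;
    # empty addresses are dropped.
    recipients = [addr for _, addr in getaddresses(to_headers) if addr]

    return {
        "FROM": msg.get("From", ""),
        "TO": recipients,
        "CONTENT": msg.get_payload(),
    }


# Hypothetical sample message, not taken from the corpus.
sample = (
    "From: alice@example.com\n"
    "To: bob@example.com, carol@example.com\n"
    "Subject: Test\n"
    "\n"
    "Hello there.\n"
)

fields = extract_fields(sample)
print(fields["FROM"], fields["TO"], fields["CONTENT"].strip())
```

Returning TO as a list rather than a single string keeps messages with several recipients easy to analyze, for example when building a sender-to-recipient graph.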
Other
Comments | |
---|---|
What kind of data/tools do you wish to have? | |
Visualized information | |
Sample data |