@@ -59,9 +59,8 @@ This dataset serves aims to be a resource for researchers focusing on AI-generat
 <!-- GETTING STARTED -->
 ## Getting Started
 
-### Composition
+### 📌 Dataset Structure
 
-Here's a breakdown of the files in this dataset:
 * 76,089 total files
 * 58,524 files of original authors from the 2020 Google Code Jam
 * 17,565 rewritten files using GPT-4o
@@ -76,13 +75,24 @@ Researchers can use this dataset to:
 
 <p align="right">(<a href="#readme-top">back to top</a>)</p>
 
+## 🔗 Citation
+If you use this dataset, please cite:
+
+```bibtex
+@misc{P24_GCJ,
+author = {Paek, Timothy},
+title = {GPT Java GCJ Dataset: The Largest LLM-Generated Code Dataset from Google Code Jam},
+year = {2024},
+howpublished = {GitHub Repository},
+url = {https://github.com/tipaek/GPT-Java-GCJ-Dataset}
+}
+```
+
 <!-- CONTACT -->
 ## Contact
 
 Timothy Paek - [LinkedIn](https://www.linkedin.com/in/timothy-paek/) - tipaek@syr.edu
 
-Project Link: [https://github.com/tipaek/GPT-Java-GCJ-Dataset](https://github.com/tipaek/GPT-Java-GCJ-Dataset)
-
 <p align="right">(<a href="#readme-top">back to top</a>)</p>
 
 <!-- ACKNOWLEDGMENTS -->