key: cord-0459698-lm1j1wk0
authors: Biderman, Stella; Raff, Edward
title: Neural Language Models are Effective Plagiarists
date: 2022-01-19
journal: nan
DOI: nan
sha: f051998a089d62a598813ccbe937c0f387e1c32b
doc_id: 459698
cord_uid: lm1j1wk0

As artificial intelligence (AI) technologies become increasingly powerful and prominent in society, their misuse is a growing concern. In educational settings, AI technologies could be used by students to cheat on assignments and exams. In this paper we explore whether transformers can be used to solve introductory level programming assignments while bypassing commonly used AI tools to detect plagiarism. We find that a student using GPT-J [Wang and Komatsuzaki, 2021] can complete introductory level programming assignments without triggering suspicion from MOSS [Aiken, 2000], a widely used plagiarism detection tool. This holds despite the fact that GPT-J was not trained on the problems in question and is not provided with any examples to work from. We further find that the code written by GPT-J is diverse in structure, lacking any particular tells that future plagiarism detection techniques may use to try to identify algorithmically generated code. We conclude with a discussion of the ethical and educational implications of large language models and directions for future research.

The COVID-19 pandemic has led to a boom in online education [Affouneh et al., 2020] due to the health risks of in-person teaching [Juliana et al., 2020]. This has in turn led to a growing market for tools that claim to ensure academic integrity in online courses [Yasid et al., 2020]. With the increased attention to and wider use of these tools, it becomes even more important to have a thorough understanding of whether, and how, such tools may be subverted by students. While there is a substantial literature on plagiarism techniques and detection strategies, this literature has not yet engaged with the recent breakthroughs in transformers research.

Recent work in the transformers literature has investigated the ability of large language models to solve college-level homework assignments in a variety of domains including calculus [Drori et al., 2021], linear algebra [Drori and Verma, 2021], astronomy [Shporer et al., 2021], and machine learning [Tran et al., 2021], but has not engaged with the issues and implications of real-world application of these results.

In this paper we carry out the first investigation of the ability of a transformer-based language model to generate de novo completions of homework assignments in a college-level course while bypassing plagiarism detection techniques. For our investigation we choose to use Measure of Software Similarity (MOSS) [Aiken, 2000], a popular tool used to monitor academic integrity in universities worldwide [Bowyer and Hall, 1999; Lancaster and Culwin, 2004; Luke et al., 2014; Sheahen and Joyner, 2016], to identify likely plagiarisms. We choose to study MOSS because it is an open-source model representing the state-of-the-art for plagiarism detection and its design has been widely copied by commercial tools [Devore-McDonald and Berger, 2020]. We note that the licensing of commercial tools prevents us from studying them directly. We find that the GPT-J model [Wang and Komatsuzaki, 2021] is capable of fooling MOSS [Aiken, 2000] with a high degree of reliability, that little human editing is required to produce assignments that would receive full marks in a course, and that GPT-J solutions have sufficient diversity to avoid easy detection. Our study raises serious questions about the efficacy and ethicality of using MOSS-based tools as a sole or primary identifier of academic plagiarism.

The rest of our paper is organized as follows. In section 2 we review the related work and explain why prior studies of language models have not provided satisfying answers about their plagiarism risk. In section 3 we describe our methodology, providing a realistic account of how a student may use GPT-J to plagiarize an assignment and what (simple) prompt engineering is required to produce the results. Section 4 presents a detailed analysis of the results, cataloging the types of errors made by GPT-J, measuring the similarity and detection risk of multiple plagiarism strategies, and showing that GPT-J is challenging to detect. Further detailed analysis of individual assignments is in the appendix. Due to the nature of this work, in section 5 we discuss the ethics and why we believe the benefits of publishing outweigh the risks, noting that the limits of GPT-J's abilities prevent it from carrying a student through an entire degree, and that other teaching touch points like exams and quizzes provide a remedy for the informed instructor. Finally we conclude with a discussion of limitations and related Human-Computer Interaction (HCI) problems in section 6.

Very recent concurrent works have looked at college assignment generation [Drori et al., 2021; Shporer et al., 2021; Tran et al., 2021] as a means of understanding what language models like GPT-3 [Brown et al., 2020] and Codex [Chen et al., 2021a] learn. Not only are these models an order of magnitude larger than GPT-J, but the study setups do not evaluate a realistic scenario for plagiarism. The experiments require significant coding, prompt engineering, and infrastructure that a novice student would not have or be able to replicate (indeed, if they could, there is no question they could complete these introductory assignments without assistance). Our work instead focuses on a realistic threat model to determine that there is a real risk that needs further study, though we suspect many of the best solutions would be non-technical (see section 5).

Previous research has used language models to detect textual plagiarism with modest success in a laboratory setting [Chong et al., 2010; Foltỳnek et al., 2020]. While we were unable to find papers that explicitly seek to use language models for plagiarism, there is a wealth of research both on paraphrasing [Narayan et al., 2018; Liu and Lapata, 2019; Li et al., 2020] and on adversarial examples against textual models [Krishna et al., 2020; He et al., 2021; Darmetko, 2021], both of which can easily be applied to plagiarism even if the papers do not say so explicitly. A key difference is that these prior works all require an initial valid solution that becomes plagiarized, i.e., modified to avoid detection. In our study we show that a modern neural language model can produce valid, or valid with little additional work, solutions to novel questions that have no given solution. The user is then plagiarizing the model itself, rather than a human being. To the best of our knowledge this is the first demonstration of such an ability, and it poses new concerns about how to prevent such plagiarism.

There is comparatively less research on AI-based code plagiarism, likely because AI techniques for generating code have only recently become prominent [Wang and Komatsuzaki, 2021; Hendrycks et al., 2021; Chen et al., 2021a; Austin et al., 2021; Mukherjee et al., 2021]. The most relevant work that we are aware of is Dawson et al. [2020], who use machine learning to compare a student's submissions against their prior submissions in order to detect "contract" plagiarism, where assignments are outsourced to third parties and the coding style of the student's submissions is therefore inconsistent. While Dawson et al. [2020] found promising evidence of machine learning being helpful in this case, there is also a risk of neural style transfer [Hu et al., 2020] being leveraged to combat this. The resulting adversarial game is beyond the scope of this work, but the need for greater study is reinforced by our findings.

To evaluate the ability of GPT-J to fool MOSS, we compare how MOSS views code generated by GPT-J with a dataset of plagiarized introductory coding assignments created by Karnalim et al. [2019]. The dataset contains seven programming exercises suitable for an introduction to programming course with accompanying solutions written in Java. For each solution (henceforth referred to as "original solution"), Teaching Assistants (TAs) who have experience in introductory computer science courses composed independent solutions to the exercise (henceforth "non-plagiarisms"). These represent two distinct sources of valid solutions to the exercise: one that may be 'altered' to produce plagiarisms, and another that can be used to measure similarity (i.e., similarity between source and plagiarized variant, and between plagiarized variant and an independent solution).

Around fifty plagiarisms of the original solutions are made, designed using techniques from, and classified according to, the taxonomy presented in Faidhi and Robinson [1987]. Following [Karnalim et al., 2019], we refer to these plagiarism categories as "levels" with "level 1" being the simplest and "level 6" being the most sophisticated form of plagiarism. While the exact number of plagiarized solutions varies slightly, every exercise has between seven and nine examples of plagiarisms per category and examples of plagiarisms from all seven categories in the taxonomy. In total, each programming exercise is paired with 15 non-plagiarisms and between 49 and 54 plagiarisms of varying types.

A comparison of MOSS scores by plagiarism level is shown in Figure 1 , with a higher similarity score indicating that an assignment is more likely to be plagiarized. Faidhi and Robinson [1987] and Karnalim et al. [2019] find that level 1 through 3 plagiarisms can be easily detected by human graders, that level 3 plagiarisms are borderline, and that level 6 plagiarisms can consistently fool human graders. These claims are consistent with the fact that level 4, level 5, and level 6 plagiarisms obtain similar MOSS scores to the genuinely not-plagiarized solutions.

We prompt GPT-J with the descriptions of the exercises provided in [Karnalim et al., 2019] and attempt to use it to produce code that compiles and correctly solves the problem. To evaluate our success, we have three major criteria:

1. We wish to obtain code that correctly solves the exercise.
2. We wish to obtain code that is not flagged by MOSS as suspiciously similar to the original code.
3. We wish to minimize the amount of human modification necessary to obtain code from raw GPT-J output.

As we will show later in section 4, these desiderata can be obtained with relative ease despite GPT-J not knowing about MOSS, and can generally be met just by querying GPT-J multiple times.

The problems that we use to prompt GPT-J are shown below, organized loosely in increasing order of difficulty. These descriptions are taken verbatim from [Karnalim et al., 2019], and reflect the formatting and stylistic choices made by the authors.

Figure 1: MOSS scores for the solutions in Karnalim et al. [2019], broken down by how they were generated. "Original" denotes the original solutions, "LN" denotes Level N plagiarism, and "non" denotes non-plagiarisms.

1. Write a program that prints "Welcome to Java" five times.
2. Write a program that accepts the radius & length of a cylinder and prints the area & volume of that cylinder. All inputs and outputs are real numbers.
3. Write a program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight, normal, overweight, or obese. A person is underweight if $BMI < 18.5$; normal if $18.5 \leq BMI < 25$; overweight if $25 \leq BMI < 35$; or obese if $BMI \geq 35$.
   $Height = feet * 12 + inches$
   $BMI = weight * 0.45359237 / (height * 0.0254)^2$
4. Write a program that shows a conversion table from miles to kilometers where one mile is equivalent to 1.609 kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation.
5. Write a program that accepts an integer and displays that integer with its digits shown in reverse. You should create and use a method void reverse(int number) which will show the reversed-digit form of the parameterised number.
6. Write a program that accepts 10 integers and shows them in reversed order.
7. Write a program that accepts a 4 × 4 matrix of real numbers and prints the total of all numbers placed on the leading diagonal of the matrix. You should create and use a method double sumMajorDiagonal(double[][] m) which will return the total of all numbers placed on the leading diagonal of the parameterised matrix.
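To make the arithmetic in exercise 3 concrete, the following is a minimal sketch of the stated formula and thresholds. It is not taken from the dataset or from GPT-J's output; the class name `BmiSketch`, the method names, and the sample input are hypothetical and chosen only for illustration.

```java
/** Illustrative sketch of the exercise 3 arithmetic; all names are hypothetical. */
final class BmiSketch {
    // Conversion constants exactly as they appear in the exercise statement.
    static final double KG_PER_POUND = 0.45359237;
    static final double METERS_PER_INCH = 0.0254;

    /** BMI = weight * 0.45359237 / (height * 0.0254)^2, with height = feet * 12 + inches. */
    static double bmi(double weightPounds, double feet, double inches) {
        double heightInches = feet * 12 + inches;
        double heightMeters = heightInches * METERS_PER_INCH;
        return weightPounds * KG_PER_POUND / (heightMeters * heightMeters);
    }

    /** Thresholds from the exercise: < 18.5, [18.5, 25), [25, 35), >= 35. */
    static String category(double bmi) {
        if (bmi < 18.5) return "underweight";
        if (bmi < 25) return "normal";
        if (bmi < 35) return "overweight";
        return "obese";
    }

    public static void main(String[] args) {
        // Sample input (an assumption for illustration): 150 lb, 5 ft 10 in.
        double value = bmi(150, 5, 10);
        System.out.printf("BMI = %.2f (%s)%n", value, category(value));
    }
}
```

Note that the square is written as an explicit multiplication; Java has no exponent operator, a point that becomes relevant when examining GPT-J's mistakes on this exercise.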

As GPT-J was trained on mathematics and computer science text that is written in LaTeX [Gao et al., 2020], we choose to provide the model with an input prompt written in LaTeX. This has the notable advantage of allowing us to provide a prompt that is identical to what is presented in Karnalim et al. [2019], and is hopefully reflective of how a real student might opt to type a homework assignment into a language model.

There is a substantial literature on "prompt programming," or crafting inputs to language models in ways that help the model perform better at downstream tasks [Reynolds and McDonell, 2021; Li and Liang, 2021; Chen et al., 2021b] (for a survey, see [Liu et al., 2021]). Although we expect students to try varying phrasing and framing to elicit improved results, we leave the impact of this to future work, as it raises an important human-computer interaction problem beyond the scope of this article. The only adjustment that we make to the exercises presented in Karnalim et al. [2019] is that we add the word "Java," to make each exercise begin "Write a Java program..." A cursory exploration of GPT-J without this modification shows that it is not inclined to write in Java without specific prompting, often giving results in Python and C instead. Once the prompts are modified to include the word "Java," all responses are either natural text, LaTeX, or Java, except for some of the responses to exercise 4, which were in C.

Prompt                               Not code    C    Java    Python
"Write a program that..."               25       6      4       0
"Write a C program that..."             22      12      1       0
"Write a Java program that..."          19       3     13       0
"Write a Python program that..."        22       1      0      12

Figure 2: Natural language prompts are pretty hit-or-miss when it comes to getting GPT-J to produce an actual program, but it is responsive to naming specific programming languages. The table shows the results of 5 generations for each of the 7 programming tasks.

We use the free GPT-J API offered by EleutherAI. The API offers two options for tuning generations called "top-p" and "temperature." For the purposes of this paper we stick with the default values of 0.9 and 0.8 respectively, and leave exploring the influence of these parameters for future work. The API also provides a "send results as prompt" button, which concatenates the prompt with the generated response and resubmits the combined string as a new prompt, allowing the model to continue from where it left off. This is especially helpful when the model generation halts before completing a piece of code. Although autoregressive language models can generate indefinitely, for cost reasons the online demo limits the number of tokens that it will return. For this paper, we resend the results as a prompt for a new generation until we appear to reach a complete program or until five consecutive generations yield no code, whichever happens first.
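The stopping rule described above can be sketched as a small loop. This is only an illustration under stated assumptions: in the study these decisions were made by a person using the web demo, so `looksLikeCode` and `looksComplete` are crude hypothetical stand-ins for that judgment, and the `model` argument is a placeholder for whatever returns a continuation of the text so far, not the EleutherAI API.

```java
import java.util.function.UnaryOperator;

/** Sketch of the "send results as prompt" loop; all names here are hypothetical. */
final class ResendLoopSketch {
    // Stop after this many consecutive continuations that contain no code.
    static final int MAX_CONSECUTIVE_NON_CODE = 5;

    /** Crude stand-in for a human deciding whether a continuation contains code. */
    static boolean looksLikeCode(String text) {
        return text.contains(";") || text.contains("{") || text.contains("}");
    }

    /** Crude stand-in for "appears to be a complete program": at least one brace, all balanced. */
    static boolean looksComplete(String text) {
        int depth = 0;
        boolean sawBrace = false;
        for (char c : text.toCharArray()) {
            if (c == '{') {
                depth++;
                sawBrace = true;
            } else if (c == '}') {
                depth--;
            }
            if (depth < 0) return false; // a closing brace with no matching opener
        }
        return sawBrace && depth == 0;
    }

    /** Repeatedly feeds prompt + output back to the model until one stopping condition holds. */
    static String generate(String prompt, UnaryOperator<String> model) {
        StringBuilder transcript = new StringBuilder(prompt);
        int nonCodeStreak = 0;
        while (nonCodeStreak < MAX_CONSECUTIVE_NON_CODE) {
            String continuation = model.apply(transcript.toString());
            transcript.append(continuation);
            nonCodeStreak = looksLikeCode(continuation) ? 0 : nonCodeStreak + 1;
            // Only the generated part is inspected, since a LaTeX prompt may itself contain braces.
            if (looksComplete(transcript.substring(prompt.length()))) break;
        }
        return transcript.toString();
    }
}
```

A caller would pass a function that queries the model; for a dry run, any `UnaryOperator<String>` that returns canned text is enough to exercise both stopping conditions.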

One particularly noteworthy aspect of generating code is that a student using GPT-J to cheat is most likely able to test whether or not the resulting code solves the problem, even at low skill levels, since compiling and running a program are among the first steps in curricula. Additionally, they are able to receive hints as to why their code doesn't work by running it and examining the errors that are printed out. The concrete and objective nature of the output gives the student unambiguous feedback before receiving a grade. In this context, to produce a more realistic threat model, we allow the human using the AI to cheat to modify the outputs lightly. Specifically, we add whatever imports are necessary (see footnote 3) and lightly edit the code by correcting simple syntax errors and off-by-one errors and by removing "obvious" mistakes. A student is likely to fix these types of errors given the objective nature of grading and assignment completion, and in doing so we document and categorize the types of errors we observe across our results.

While what is considered obvious varies from person to person, we have sought to take a very conservative approach in the hopes of understating GPT-J's capabilities and representing a realistic use-case by a minimally knowledgeable student. All generated text and human-edited final solutions can be found in the supplemental materials.

We note that this is in contrast to using GPT-J to cheat on a creative writing assignment, write mathematics proofs, or solve science homework assignments, where a student's ability to evaluate and improve the quality of the generated writing is connected to their subjective ability to complete the homework assignment without assistance, and to understand the preferred or desired intent and style of the instructor. The feedback on correctness does not occur until after it is too late to "refine" the plagiarism. For this reason we do not explore this more subjective-to-evaluate area of plagiarism.

In this section we give a high level overview of the results of our experiments.

GPT-J generates correct solutions with minimal intervention. For six of the seven programming exercises, GPT-J can produce a complete solution that requires no editing. In most cases GPT-J makes minor mistakes that need correction by a human, but these overwhelmingly do not require any knowledge of computer programming beyond the ability to run code and search the internet for the error reported by the compiler.

Memorization does not explain GPT-J's performance. Transformers, like all neural networks, exhibit a behavior commonly referred to as "memorization" where long passages of the training data are regurgitated word-for-word [Carlini et al., 2019; Feldman, 2020; Carlini et al., 2021]. We explore the possibility that the exercises and their solutions are memorized by searching through GPT-J's training data for exact matches of 20 tokens or more. We find [...] Karnalim et al. [2019] avoids being in the training data by having their preprint posted on the University of Warwick website rather than arXiv. Crucially, this means that generating novel questions is not an effective way for professors to prevent their students from using GPT-J to cheat.

Footnote 3: Missing imports are reported by the Java compiler and can be easily found by searching the web.

                            Problem Number
Errors              1    2    3    4    5    6    7    Tot.
No Errors           2    0    2    0    1    4    2    11
Syntax Error        1    6    3    0    0    0    0    10
Off by One Error    0    0    0    0    0    1    0     1
Misc.               5    0    3    0    0    3    1    12
Total Correct       8    6    5    0    1    8    3    31

Table 1: Minor errors made by GPT-J in problems that were ultimately judged to be "correct." Note that the columns do not add up to "Total Correct" because some solutions have multiple errors.

                            Problem Number
Correctness           1    2    3    4    5    6    7    Tot.
Correct               8    6    5    0    1    8    3    31
Partial Completion    1    1    0    1    0    2    0     5
Wrong                 0    0    1    1    4    3    0     9
Actually in C         0    0    0    2    1    1    0     4
Not Code              6    8    9   11    9    1   12    56

GPT-J does not register as plagiarizing by MOSS. Out of the correct completions, none of the code generated by GPT-J stood out as potential plagiarism according to MOSS. GPT-J's exercise completions rated similarly to the non-plagiarisms and to the plagiarisms created using the most advanced techniques in our dataset, as shown in Table 3 and Figure 3.

Figure 3: A breakdown of MOSS scores for completions of exercises by how they were generated. This plot is the same as Figure 1, except it adds an "AI" column denoting GPT-J's code. A higher score indicates more code similarity with the reference solution.

There is no easy way to detect GPT-J's output. Although MOSS is unable to detect GPT-J's plagiarisms when they are mixed in with genuine solutions, one might think that having examples of GPT-J's solutions would allow an instructor to detect other outputs from GPT-J. We find that this is not the case in two senses: MOSS does not judge GPT-J's solutions to be notably similar to each other, and a clustering algorithm trained to distinguish between GPT-J's solutions and those in the dataset fails to do so. While this does not rule out the possibility of more advanced techniques detecting GPT-J's fingerprints, the techniques currently used for plagiarism detection in practice fail to do so.

To further demonstrate the syntactic diversity of the GPT-J solutions, we perform 2D embeddings of the MOSS scores using the Isomap algorithm. Isomap is preferred in this instance because it: 1) embeds the data based on a geodesic assumption about the manifold, which we know to hold because of the multiple submissions per homework and the relationship between a source and its plagiarized copies; and 2) preserves global and local relationships in the embedding, which we care about in order to understand the degree of separation between solutions (as measured by MOSS). Other popular approaches such as t-SNE and UMAP lack this second property, which makes interpretation of their results difficult.

The Isomap results are found in Figure 4, where we see that the GPT-J solutions exist predominantly in their own independent area, or near other original solutions (green). Each shape indicates the homework question a solution is designed for, and the overall distribution reinforces the geodesic assumption: solutions to the same assignment cluster, plagiarisms cluster near their source (black), and better plagiarisms (yellow) are further away than ineffective ones (red) (see footnote 4).

We note that the dispersion of the GPT-J solutions in "their own space" does not imply that they would be easy to detect. The green independent solutions are not a complete sampling of the space, and often exist in similar areas, so a model built from this data, which represents one course, is unlikely to generalize due to strong violation of the I.I.D. assumption (e.g., everyone had the same professor, TAs, study groups, etc.). We are aware of no longitudinal data of plagiarized and independent assignments, but our results suffice to show that this is a novel problem with no trivial solutions.

Footnote 4: Yellow plagiarisms are "level 6" in the taxonomy of Faidhi and Robinson [1987]. Both our experiments and those of Faidhi and Robinson [1987] indicate that level 6 plagiarisms are effectively indistinguishable from non-plagiarized assignment completions.

Figure 4: [...], plagiarized variants (red and yellow, where yellow are sophisticated ones that evade MOSS), along with GPT-J produced solutions (blue) and independent implementations (green). The blue data points being interspersed with yellow, green, and far-away spaces shows that GPT-J is not easily detected by MOSS.

This paper details a free and easily accessible way to carry out academic misconduct without being detected using common approaches. While we anticipate that the results of this paper will be concerning to many, it is important to recognize that these flaws and risks exist. In particular, we do not have evidence that GPT-J is sufficient for prolonged plagiarism into more advanced courses, which limits the degree of impact (students who reach such courses are likely to fail out or revert to some other form of cheating). GPT-J also provides no means for students to circumvent in-person quizzes, exams, and other assignments that may be used to identify possible cheaters. For example, in our experience we follow up and carefully review students who have high grades in one aspect of the course and low grades in the other, both to identify potential misconduct and to identify students who may have special needs that have not been met.

We believe these additional factors modulate the risk of publishing this research, and that studying the problem is ultimately beneficial. Indeed, the use of the publicly available and inspectable GPT-J was a requirement for us to confirm that the generated solutions are novel and not regurgitated memorized content. Further remediation of risk may be possible by incorporating GPT-J into a modernized curriculum. For example, one could envision a course assignment in which students generate multiple solutions using GPT-J, rank them based on code readability, identify and describe the types of errors produced, and perform other code review and debugging tasks. This would enable growth in skill, and could be used to establish boundaries of when and how GPT-J (or future variants) should be used in an academic context. Ultimately this is a significant item for future work.

While our results present a promising (from a research perspective, see the aforementioned ethics section) analysis of the use of GPT-J to plagiarize coding assignments, there are still significant limitations and important avenues for future work. Importantly, many aspects of our work's success pose long term challenges and new research questions.

In this paper we speak of MOSS as a tool for detecting plagiarism because our focus is on the use of text similarity analysis to detect cheating in academic environments. However, while this is the intended use of MOSS, it is not the only one. As noted previously, there are many tools that appear to be suitable for detecting plagiarism but which are primarily marketed for other purposes. In future work we intend to examine how the intended application is explicitly and implicitly embedded in the functionality of document similarity detectors [Johnson, 2022; Birhane et al., 2021], and to incorporate analysis of what it means for such a device to be judged as "working" in its planned application context.

In this paper we consider the example of a student with minimal knowledge of computer programming. We allow them to make minimal edits, but the user is not modeled as performing significant intellectual work. Recent research on human-AI co-creative systems has shown that a human working together with an AI can solve problems that neither can solve individually.

Our study has several technical limitations, most notably a small number of GPT-J generations and a lack of experimentation with prompt programming. Future more extensive experiments with GPT-J may reveal a more complete picture than what we find.

Despite their popularity in Natural Language Processing, there has been very little work on transformers from a human-computer interaction point of view. In particular we were unable to identify any literature investigating how people go about using and interacting with language models, and consequently need to base some of our experimental methodology on conjecture and common sense. We hope that research into how people use transformers will empower future researchers to iterate upon our experimental design with more realistic assumptions about student behavior and model usage.

GPT-J was trained using a mix of raw text, source code, and formatted text. This gives it a tendency to use line-breaks inconsistently, sometimes using single-spacing, sometimes using double spacing, and sometimes typing out the newline character \n. Similar phenomena happen with tabs and indentation.

Model generated text in this appendix has had its indentation and line breaks adjusted for consistency, space concerns, and improved human readability. Although this would have an impact on proper code execution in some languages, all line breaks and indentations in Java are ornamental. Other than these readability fixes, all text presented in teletype is a verbatim quote of the model output. By contrast, code presented in colorized codeblocks has been lightly edited as discussed in section 3.3.

We begin by presenting representative solutions to each exercise. For complete model outputs on all generations, see Appendix D.

Exercise 1: Write a program that prints "Welcome to Java" five times.

        System.out.println("Welcome to Java");
        System.out.println("Welcome to Java");
        System.out.println("Welcome to Java");
        System.out.println("Welcome to Java");
        System.out.println("Welcome to Java");
    }
}

Exercise 2: Write a program that accepts the radius & length of a cylinder and prints the area & volume of that cylinder. All inputs and outputs are real numbers.

        double area = Cylinder.area(r, l);
        double volume = Cylinder.volume(r, l);
        System.out.println("The area of cylinder is " + area);
        System.out.println("The volume of cylinder is " + volume);
    }

    static double area(double r, double l) {
        double area = 2 * r * r * Math.PI + 2 * r * Math.PI * l;;
        return area;
    }

    static double volume(double r, double l) {
        double volume = l*Math.PI*Math.pow(r, 2);
        return volume;
    }
}

Exercise 3: Write a program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight, normal, overweight, or obese.

            System.out.println("Obese");
        }
    }
}

Exercise 4: Write a program that shows a conversion table from miles to kilometers where one mile is equivalent to 1.609 kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation.

This task was not successfully completed by GPT-J in 15 trials. The closest it got was generating a conversion calculator, but writing code that generates a table seems to have stymied it.

Exercise 5: Write a program that accepts an integer and displays that integer with its digits shown in reverse. You should create and use a method void reverse(int number) which will show the reversed-digit form of the parameterised number.

Although GPT-J reliably constructs the general logic of an exercise, human intervention is often required to correct typos, amend the solution to the problem specification, or on rare occasions identify the correct solution from several options.

GPT-J was trained on Stack Overflow, and this is clearly reflected in some of its generations, which directly mimic the style of the Q&A site. However, GPT-J (as it is being used in this paper) doesn't seem to be able to correctly pair the claim that there is an error with an actually erroneous code example, or to correct an incorrect piece of code.

In the following two examples, GPT-J gives a correct solution and then creates a response that represents it as being incorrect. The "correction" doesn't actually change the code, and proposes swapping out a segment of the code for a functionally identical segment.

Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight, normal, overweight, or obese.\\ A person is underweight if $BMI < 18.5$; normal if $18.5 \leq BMI < 25$; overweight if $25 \leq BMI < 35$; or obese if $BMI \geq 35$.\\ $Height = feet * 12 + inches$\\ $BMI = weight * 0.45359237 / (height * 0.0254)^2$

I tried to solve the problem but the program is not working. It shows only "Can't find variable weight"

In this next example generation, GPT-J provides three candidate solutions but only one of them is correct. Fortunately, the user can simply run the code to evaluate them without having to understand the problem's specification. Note as well that in all three examples GPT-J omits code for reading in the input and instead hard-codes the array to be reversed. As a result, we mark this generation as "partially correct."

Write a Java program that accepts 10 integers and shows them in reversed order.

    2, 3, 4, 5, 6, 7, 8, 9, 1};
    for(int i = a.length-1; i >= 0; i--) {
        System.out.print(a[i]);
    }
}

There are several recurring themes in the mistakes GPT-J makes. We document them here for completeness, and expect them to be of particular interest to HCI researchers in the future.

Although GPT-J generally has a good grasp of mathematics, it regularly makes mistakes in the formulae for the surface area and the volume of a cylinder in Exercise 2. For a cylinder of radius $r$ and length $\ell$, the correct formulae are $A = 2\pi r^2 + 2\pi r \ell$ and $V = \pi r^2 \ell$. While these are not provided in the problem statement, they are readily available on the internet, so we assume a student is able to correct them.
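A minimal sketch of the corrected computation, assuming a closed cylinder (so the surface area includes both circular ends). The class and method names are hypothetical and are not taken from any GPT-J generation.

```java
public class CylinderFormulas {
    // Total surface area: two circular ends (2*pi*r^2) plus the lateral surface (2*pi*r*length).
    static double area(double radius, double length) {
        return 2 * Math.PI * radius * radius + 2 * Math.PI * radius * length;
    }

    // Volume: area of the circular base (pi*r^2) times the length.
    static double volume(double radius, double length) {
        return Math.PI * radius * radius * length;
    }

    public static void main(String[] args) {
        double radius = 1.0;
        double length = 2.0;
        System.out.println("Area: " + area(radius, length));
        System.out.println("Volume: " + volume(radius, length));
    }
}
```

Comparing a generation against these two expressions is enough to spot the typical errors, such as omitting the length from the volume.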

Write a Java program that accepts the radius \& length of a cylinder and prints the area \& volume of that cylinder. All inputs and outputs are real numbers. The program is to be given as input and output and be stored in a file named "CylinderArea.java".

System.out.print("Enter radius: "); radius = sc.nextDouble(); System.out.print("Enter length: "); length = sc.nextDouble(); double area; area = Math.PI * radius * radius; double volume; volume = 2 * Math.PI * radius * length; System.out.println("The area of the cylinder is " + area + " and the volume is " + volume + "."); } }

Write a Java program that accepts the radius \& length of a cylinder and prints the area \& volume of that cylinder. All inputs and outputs are real numbers.

System.out.println("Enter radius of the cylinder:"); rad = scan.nextInt(); System.out.println("Enter length of the cylinder:"); len = scan.nextInt(); area = Math.pow(rad, 2) * len; volume = Math.pow(len, 3); System.out.println("The area of the cylinder is : " + area); System.out.println("The volume of the cylinder is : " + volume); scan.close(); } }

It will do the trick.

The other common mathematical mistake occurs in Exercise 3, computing BMI. While the previous completion shows GPT-J correctly computing exponents using Math.pow(len, 3);, in Exercise 3 GPT-J consistently expresses exponents using the caret symbol (^). We hypothesize that this is caused by the use of LaTeX to express the equations in the statement of Exercise 3, which shows GPT-J the caret symbol; it proceeds to copy the symbol despite seeming to know that this is not how exponentiation is written in Java.
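The fix a student must apply is small. The sketch below follows the formula and thresholds given in the Exercise 3 statement and uses `Math.pow` in place of the caret; the class and method names are hypothetical, and the sample input in `main` is an arbitrary illustration.

```java
public class BmiExample {
    // BMI = weight * 0.45359237 / (height * 0.0254)^2, with height in inches.
    static double bmi(double weightPounds, double feet, double inches) {
        double heightInches = feet * 12 + inches;
        double heightMeters = heightInches * 0.0254;
        // Math.pow performs exponentiation. In Java, '^' is bitwise XOR:
        // it does not compile for double operands, and for ints 5 ^ 2 is 7.
        return weightPounds * 0.45359237 / Math.pow(heightMeters, 2);
    }

    // Thresholds as given in the exercise statement.
    static String category(double bmi) {
        if (bmi < 18.5) return "underweight";
        if (bmi < 25) return "normal";
        if (bmi < 35) return "overweight";
        return "obese";
    }

    public static void main(String[] args) {
        double value = bmi(150.0, 5.0, 10.0);
        System.out.println("BMI: " + value + " (" + category(value) + ")");
    }
}
```

Because the caret fails to compile when applied to `double` values, this particular mistake is surfaced by the compiler rather than by testing.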

Another common failure mode for GPT-J is inventing new problem constraints that do not actually exist in the problem. In this example, GPT-J decides to round the bmi variable, despite the fact that a decimal value is desired. Other errors also exist in this piece of code, but they are easily fixed by rereading the problem specification and the overall logic is correct. 

A related failure case occurs because our inputs do not clearly delineate where the question ends and where the answer begins. It is common when working with prompted datasets to use special tokens or natural language cues to tell the model that the question is over and it should produce the answer, a practice we avoid to keep our inputs as close to the original assignments as possible. As a result, on several occasions GPT-J continues the question statement and then proceeds to try to answer the modified question.
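To illustrate what such a cue looks like, the sketch below appends an explicit end-of-question marker to an assignment text. This is purely hypothetical: the cue string and helper name are assumptions made for illustration, and the experiments in this paper deliberately do not use anything of the kind.

```java
public class PromptCue {
    // Hypothetical natural language cue marking the end of the question.
    static final String ANSWER_CUE = "\n\nSolution:\n";

    // Returns the assignment text followed by the cue.
    static String withCue(String assignmentText) {
        return assignmentText.trim() + ANSWER_CUE;
    }

    public static void main(String[] args) {
        String prompt = withCue("Write a Java program that accepts 10 integers and shows them in reversed order.");
        System.out.print(prompt);
    }
}
```

Without a marker of this kind, the model has no signal distinguishing "continue the question" from "begin the answer", which is consistent with the behaviour described above.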

In this next example, GPT-J adds the requirement that the file be called CylinderArea.java. Because Java requires a public class to be declared in a file of the same name, this implicitly requires that the public class be named CylinderArea, a requirement GPT-J complies with.

Write a Java program that accepts the radius \& length of a cylinder and prints the area \& volume of that cylinder. All inputs and outputs are real numbers. The program is to be given as input and output and be stored in a file named "CylinderArea.java".

The resulting generation is ultimately marked as correct, though it needs to have some import statements added and the mathematical formulae modified.

System.out.print("Enter radius: "); radius = sc.nextDouble(); System.out.print("Enter length: "); length = sc.nextDouble(); double area; area = Math.PI * radius * radius; double volume; volume = 2 * Math.PI * radius * length; System.out.println("The area of the cylinder is " + area + " and the volume is " + volume + "."); } }

We see the same situation play out in another Assignment 2 completion, where GPT-J specifies the order the inputs and outputs should be presented in and then complies with its requirement.

Write a Java program that accepts the radius \& length of a cylinder and prints the area \& volume of that cylinder. All inputs and outputs are real numbers. Input and output are in the following order:

Input: radius \& length of cylinder 

The following are examples of correct solutions generated by GPT-J for each exercise.

Write a program that prints "Welcome to Java" five times.

Write a Java program that prints ``Welcome to Java'' five times.

Substitute the character in the message string for an integer, for example, ``Welcome to Java'', and substitute the integer for a character.

The program should run until the user enters the character ``e'' to terminate the program.

When the user enters the character ``e'' the program should print the character ``e''.

The user should be able to enter the integer between 1 and 10.

The user should be able to enter the character between A and Z.

The user should be able to enter the character between 0 and 9.

The user should be able to

Generation 1-2: Not code

Write a Java program that prints ``Welcome to Java'' five times.

Write a Java program that prints ``Welcome to Java'' five times.

-

Write a Java program that prints ``Welcome to Java'' five times.

for(int i=0;i<5;i++) { System.out.print("Welcome to Java"); }

Generation 1-5: Not code

Write a Java program that prints ``Welcome to Java'' five times.

The program is required to be written with a single main method. The main method must be called using the following statement:

The program must be organized into three classes.

{ System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); } }

import java.io.*;

public class Java { public static void main(String args[]) { System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); } }

Generation 1-7: Not code

Write a Java program that prints ``Welcome to Java'' five times. The program should use a loop to print the message.

Write a Java program that prints ``Welcome to Java'' five times.

Answers I got this:

Welcome to Java Welcome to Java Welcome to Java Welcome to Java Welcome to Java

I know that this is not what you wanted, but I have no idea how you could make that happen. If you give me a little more information, I can probably give you a better answer.

Write a Java program that prints ``Welcome to Java'' five times.

This is a sample question from the JSP assignment.

System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); } }

Generation 1-10: Misc. error

Write a Java program that prints ``Welcome to Java'' five times.

Problem:

The main problem is the output of the program. The program must print out the string ``Welcome to Java'' five times.

Here is the program that prints the welcome message five times. System.out.println("Welcome to Java!");

for (int i = 0; i < 5; i++) { System.out.println("Java is a great language!"); } } } Output:

Welcome to Java! Java is a great language! Java is a great language! Java is a great language! Java is a great language!

/* for (int i = 0; i < 5; i++) { System.out.println("Welcome to Java!"); } } }

Generation 1-11: Syntax error

Write a Java program that prints the string "Welcome to Java" five times. The text is displayed on a single line and the text is centered in the line. The program uses the font Courier New.

Write a Java program that prints the string "Welcome to Java" five times. It will be a good exercise to learn how to format the text and the number of printouts.

Print "Welcome to Java" five times using "\n" as the newline.

To get started, let's start with a simple class.

System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java"); System.out.println("Welcome to Java");

// read input from the user System.out.println("Enter your name"); String name = sc.nextLine(); // print welcome message System.out.println("Welcome to Java, " + name + "!"); // print welcome message again System.out.println("Welcome to Java, " + name + "!"); // print welcome message again System.out.println("Welcome to Java, " + name + "!"); // print welcome message again System.out.println("Welcome to Java, " + name + "!"); // print welcome message again System.out.println("Welcome to Java, " + name + "!"); // print welcome message again System.out.println("Welcome to Java, " + name + "!"); // read input from the user System.out.println("Enter your age"); int age = sc.nextInt(); // print welcome message System.out.println("Welcome to

// print welcome message System.out.println("Welcome to Java."); // print welcome message again System.out.println("Welcome to Java."); // print welcome message again System.out.println("Welcome to Java."); // print welcome message again System.out.println("Welcome to Java."); // print welcome message again System.out.println("Welcome to Java."); } }

Generation 1-15: Misc. error

Write a Java program that prints the string "Welcome to Java" five times.

Here is the solution: Here are some of the things that are happening:

In the for loop, k starts at zero. It is incremented by 1 each time, so the first time k = 0, the second time k = 1, the third time k = 2 and so on.

System.out.println("The area of this cylinder is " + area); double volume = cylinderLength * cylinderRadius * cylinderLength;

System.out.println("The volume of this cylinder is " + volume); } } A:

If you have a cylinder of radius r and length L, the area is πr² + πrL. The volume is πr³/3. The formula for calculating the area of a cylinder is based on the area of a slice of the cylinder. In your case, the area of a slice is πr² and the length of the slice is L - r. Therefore, the area of the cylinder is πr² + πr(L - r).

System.out.println("Please enter the length of the cylinder: "); double cylinderLength = sc.nextDouble();

System.out.println("Please enter the radius of the cylinder: "); double cylinderRadius = sc.nextDouble();

double area = 2 * cylinderRadius * cylinderRadius * Math.PI + 2 * cylinderRadius * Math.PI * cylinderLength; System.out.println("The area of this cylinder is " + area);

double volume = Math.Pi * cylinderRadius * cylinderRadius * cylinderLength; System.out.println("The volume of this cylinder is " + volume); } }

Generation 2-2: Not code

Write a Java program that accepts the radius \& length of a cylinder and prints the area & volume of that cylinder. All inputs and outputs are real numbers.

1. The radius of the cylinder is entered on the command line.

2. The area of the cylinder is printed to the command line and the volume of the cylinder is printed to the command line.

3. The radius is an integer and is stored in the variable radius and the length is an integer and is stored in the variable length.

4. The Scanner class is used to read the input values.

5. The Math class is used to perform the calculations.

6. The System class is used to perform the calculation.

7. You should have the following output when you run the program:

The area of the cylinder is 3.0. The volume of the cylinder is 12.0.

8. Compile the code. Compile the code using the javac command.

javac Area.java

9. Run the code. Run the code using the java command.

java Area

10. Test the code. Your program should produce the following output when you run the program:

The area of the cylinder is 3.0. The volume of the cylinder is 12.0.

You should read the book, Java by Brian Kernighan and Dennis Ritchie. It is the best book on programming.

You can use the if statement to check the input value.

You can use the switch statement to check the type of the input value.

You can use the double statement to check if the value is a real

Generation 2-3: Not code

Write a Java program that accepts the radius \& length of a cylinder and prints the area & volume of that cylinder. All inputs and outputs are real numbers.

A cylinder is the three-dimensional shape formed by a circle lying on the XY plane and a vertical line passing through the center of the circle. A cylinder is a solid object that can be sliced in two perpendicular planes. A cross-section of a cylinder looks like a right circular cylinder. The problem is as follows:

Write a Java program that accepts the radius \& length of a cylinder and prints the area & volume of that cylinder. All inputs and outputs are real numbers.

The program should accept two inputs and print two outputs. The first input should be the radius of the cylinder and the second input should be the length of the cylinder.

Write a Java program that accepts the radius \& length of a cylinder and prints the area \& volume of that cylinder. All inputs and outputs are real numbers.

We are given the radius and the length of the cylinder. To find the volume we use the formula \& area of a cylinder = r*l*h where \& r = radius, \& l = length, \& h = height.

We are given the radius and the length of the cylinder. To find the area we use the formula \& area of a cylinder = r*l*h where \& r = radius, \& l = length, \& h = height.

A cylinder is defined as a solid object having a circular base and a circular top and whose height is

Generation 2-8: Syntax error and misc. error

Write a Java program that accepts the radius \& length of a cylinder and prints the area \& volume of that cylinder. All inputs and outputs are real numbers. System.out.println("Enter the radius of the cylinder"); radius = keyboard.nextDouble();

System.out.println("Enter the length of the cylinder"); length = keyboard.nextDouble(); a = 2 * Math.pow(rad, 2) * Math.PI + 2 * rad * Math.PI * len; v = Math.pow(radius, 2) * Math.PI * length;

System.out.println(a, v) } }

Generation 2-9: Not code

Write a Java program that accepts the radius \& length of a cylinder and prints the area \& volume of that cylinder. All inputs and outputs are real numbers.

A:

I have done this before, you just need to calculate the volume of the cylinder using the cylinder formula.

Where r is the radius and h is the height.

A:

The area of a cylinder is the circumference of its base multiplied by its height. You have to keep track of your coordinates for the circle. (e.g., (r*r - r^2)*pi/4 = r^2*pi/2, which is the area of a circle)

System.out.println("Enter radius of the cylinder:"); rad = scan.nextInt(); System.out.println("Enter length of the cylinder:"); len = scan.nextInt(); area = Math.pow(rad, 2) * len; volume = Math.pow(len, 3); System.out.println("The area of the cylinder is : " + area); System.out.println("The volume of the cylinder is : " + volume);

scan.close(); } } It will do the trick. System.out.println("The area of the cylinder is : " + area); System.out.println("The volume of the cylinder is : " + volume);

scan.close(); } }

Generation 2-12: Syntax error and misc. error

Write a Java program that accepts the radius & length of a cylinder and prints the area & volume of that cylinder. All inputs and outputs are real numbers. The program is to be given as input and output and be stored in a file named "CylinderArea.java".

In this case, the cylinder is a circle. We can use the simple formulas:

The area of a circle is pi times the radius squared. So we can write:

But we can't use pi directly. We need to convert from a radian measure of angle to a degree measure of angle. The conversion factor is 2 pi divided by 360 degrees. So we have:

The volume of a circle is pi times the radius cubed. So we can write:

But we need to convert from a radian measure of angle to a degree measure of angle. The conversion factor is 2 pi divided by 360 degrees. So we have:

Finally, we need to calculate the radius from the length. To do that, we use the formula for the circumference of a circle:

where _C_ is the circumference, _d_ is the radius, and _r_ is the length.

So we have:

# Chapter 6. Coordinate Geometry

Coordinate geometry is a collection of mathematical techniques for determining the position and orientation of an object with respect to a given coordinate system. It is used to figure out where objects are relative to each other, and to understand their relationships with respect

This is because the output of the first input is stored in the variable radius, and the output of the second input is stored in length. When the program starts, length is initialized to 0, so the second output is not printed.

Write a program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight, normal, overweight, or obese. A person is underweight if $BMI < 18.5$; normal if $18.5 \leq BMI < 25$; overweight if $25 \leq BMI < 35$; or obese if $BMI \geq 35$. $Height = feet * 12 + inches$ and $BMI = weight * 0.45359237 / (height * 0.0254)^2$.

Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight,

One way to do it is to use the BMI formula (which is already correct), but also to include the height and weight of the person. So the BMI formula becomes $$BMI = \frac{weight*0.45359237}{height^2}$$ And if you do the math, you get that $BMI = \frac{weight*0.45359237}{height^2}$ So the BMI formula is already correct. The problem

Generation 3-3: Not code

Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight, Re: Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight, normal, overweight, or obese.\A person is underweight if $BMI < 18.

Generation 3-6: Not code

Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight,

Write a Java program that accepts the name of a person and returns the number of letters in their name.

→ An example of a valid name would be: \John".

An example of a valid input would be: \Seth".

An example of a valid output would be: \4". ]]>http://www.javapractices.com/input-output

Generation 3-7: Not code

Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight,

Enter the weight and height of a person (in pounds and inches respectively) 1.

Enter the weight and height of a person (in pounds and inches respectively) Weight :

Height :

6.0

Enter the weight and height of a person (in pounds and inches respectively) Weight :

Height :

6.5

Enter the weight and height of a person (in pounds

Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight, 

Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight, normal, overweight, or obese.\A person is underweight if $BMI < 18.5$; normal if $18.5 \leq BMI < 25$; overweight if $25 \leq BMI < 35$; or obese if $BMI \geq 35$.\$Height = feet * 12 + inches$\$BMI = weight * 0.45359237 / (height * 0.0254)^2$\If $feet = 0$, $inches = 0$, $weight = 0$, or $height = 0$, the program will display an error message.\If the input is not valid, the program will display an error message.

Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight,

Note: This is an online coding challenge, and the submission will be saved in the public repository on this website.

Step 1

Create a file with the following contents: Step 2

Make sure that you have followed the instructions in the first step properly. You should have created a new class with the given name.

Step 3

Open the command prompt and execute the following command:

javac WeightAndHeight.java

Step 4

Open the command prompt and execute the following command:

java WeightAndHeight

You should get the following output:

Write a Java program that accepts the weight (as a real number representing pound) and height (as two real numbers representing feet and inches respectively) of a person. Upon accepting input, the program will show that person's BMI (real number) and a piece of information stating whether the BMI is categorised as underweight, normal, overweight, or obese.\A person is underweight if $BMI < 18.5$; normal if $18.5 \leq BMI < 25$; overweight if $25 \leq BMI < 35$; or obese if $BMI \geq 35$.\$Height = feet * 12 + inches$\$BMI = weight * 0.45359237 / (height * 0.0254)^2$\This program is written to be done in an interactive environment (using command line). The program will not terminate unless the user inputs a quit command. The program should print the information and ask for a new input if the user inputs something other than an integer.\The user should be prompted for input in the

I suggest a more programmatic approach.

Make a method that converts

Write a program that shows a conversion table from miles to kilometers where one mile is equivalent to 1.609 kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation.

Write a Java program that shows a conversion table from miles to kilometers where one mile is equivalent to $1.609$ kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation.

Write a Java program that shows a conversion table from miles to kilometers where one mile is equivalent to $1.609$ kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation. Write a Java program that shows a conversion table from miles to kilometers where one mile is equivalent to $1.609$ kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation.

Write a Java program that shows a conversion table from miles to kilometers where one mile is equivalent to $1.609$ kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation.

The solution demonstrates how to write a Java program that shows a conversion table from miles to kilometers where one mile is equivalent to $1.609$ kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation.


The answer given to the problem is correct and valid.

Write a Java program that shows a conversion table from miles to kilometers where one mile is equivalent to $1.609$ kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation. Write a Java program that shows a conversion table from miles to kilometers where one mile is equivalent to $1.609$ kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation. Write a Java program that shows a conversion table from miles to kilometers where one mile is equivalent to $1.609$ kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation.

The program should not prompt the user for the number of kilometers to convert. The program should convert the specified number of miles to kilometers and output the converted number of kilometers to the console.

The number of kilometers to convert is to be provided as an input to the program.

Below is the program which I have written. However, it gives me an error. Note that the program should not prompt the user for the number of kilometers to convert.

You cannot post new topics in this forum. You cannot reply to topics in this forum. You cannot delete your posts in this forum.

Generation 4-6: Not code

Write a Java program that shows a conversion table from miles to kilometers where one mile is equivalent to $1.609$ kilometers. The table should display the first ten positive numbers as miles and pair them with their respective kilometer representation.

I've seen this problem asked in the past and I think I got the answer right, but I don't really understand the point of the exercise. I know the conversion is $1 \frac{miles}{km} = 1.609 \frac{km}{miles}$, but how does that help? Why are there ten values, how does the problem relate to miles to kilometers? Any help would be appreciated.

The exercise is simply to check your knowledge of the conversion between miles and kilometers. This is a little bit more subtle than it seems.

Let us denote the number of kilometers by $n$. Then, \begin{align} 1

array = new String

for (String s : array) {

You can use Collections.reverse() method to reverse the order of the elements of an array

for (String s : array) {

public static void main(String args[])throws IOException { System

array = new int

Enter the numbers to be reversed

array = new int

We would like to thank Leo Gao, Maya Fuches, and Laria Reynolds for their feedback on the paper.

In your for loop, the last value of i is going to be the length of the array, so you will always be looping over the last item in the array. You should be looping from 0 to 9 inclusive:

It's giving the output in ascending order

Generation 6-12: Partially correct (doesn't read an input)

Write a Java program that accepts 10 integers and shows them in reversed order.

We can use a for loop to reverse the elements in an array. The code will look like this:

} System.out.println(Arrays.toString(numbersReversed)); } }

As you can see, we use a for loop to loop through all the elements of the array and store the reversed elements in the numbersReversed array.

Let's see what the output will be in the console: [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] The code above is working fine, but the problem is, the output is getting printed in the same order as the original

Generation 6-13: Partially correct (doesn't read an input)

Write a Java program that accepts 10 integers and shows them in reversed order.

Please note that the matrix of numbers is not assumed to be symmetrical. For example, the following is a legal matrix:

The matrix is represented as an array of doubles. The length of the matrix should be equal to the length of the array, and the first element of the array should contain the first row of the matrix. The array should be read-only and should not be modified.

We can express the program as a method of the class \textit{SumMajorDiagonal} and pass the array \textit{m} to the method. The following is an example of the solution:

This is a good way to start working on a Java program, and it is not a bad idea to start by using a similar approach in another language, say C++. We can do this for example by using the \texttt{std::vector} class in C++.

The parameter $k$ which will be the number of rows on the matrix.

The parameter $r$ which will be the number

a main method which prompts the user to enter a matrix

a method that prompts the user to enter values for each cell of the matrix

a method that calculates the total of all numbers placed on the leading diagonal

a method that calculates the sum of all numbers in the matrix.

All input and output operations must be via the console.

You may use any method you like to calculate the sum of all numbers in the matrix, but you must make sure that the method you use is correct and that it is also correct when the matrix has values other than numbers.