BLE Blog: Birkbeck's GenAI survey of academics

Rishi Shukla, one of Birkbeck's Digital Education Consultants, shares findings of a survey of academic staff regarding their students' use of Generative Artificial Intelligence.

Overview

Birkbeck’s Digital Education team issued a short questionnaire on the teaching body’s experiences of monitoring and addressing use of generative artificial intelligence (GenAI) in coursework and exam submissions. The questionnaire was completed by 17 staff, with participation from all faculties:

Graph 1: Distribution of survey respondents by Faculty

The questions covered two areas. The first concerned informal methods used by staff to identify likely misuse of GenAI. The second addressed adaptations to teaching and learning methods and materials made by staff in response to the growth of GenAI tools. Findings for each area are detailed in the following sections, which work through the responses question by question, highlight commonly identified issues and include representative comments drawn directly from survey responses.

Spotting suspected GenAI use

What words/phrases do you find prevalent in student submissions where AI use is suspected?

A common observation amongst respondents was that there were not necessarily any specific words or phrases that acted as a ‘red flag’ in themselves. More typically, staff noted that aspects of the vocabulary, tone or authorial voice were likely to raise concerns. The table below lists these aspects, each with a representative comment.

repetition	“Multiple repetitions of words or phrases seems a potential giveaway”
exaggeration	“Words that over emphasize or overstate the significance of a point made”
vagueness	“Students using AI use buzzwords of the discipline without really engaging with them”
assuredness	“It's the breezy/polished tone of someone who is very well practiced in churning out this style of prose that contrasts markedly with how our students typically speak and write”

Some potential trigger words or phrases were also noted by a smaller proportion of respondents (including staff from all three faculties), which included: “In conclusion …”, “There are [X] main points …”, “Certainly …”, “overall”, “delve”, “some”, “others”.

Please describe if there are any grammatical clues you often find in student submissions where AI use is suspected?

The most widely shared observation was that grammatical composition of individual sentences tended to be (overly) perfected. Multiple respondents also noted that such uncharacteristically precise use of English is often accompanied by either extreme verbosity (“use of suspiciously many adjectives”), excessive formality (“older words/phrases not in common usage”), weak argument structure (“making similar points with different wording in different sections of an essay”), or insufficient depth (“lacking in actual content beyond a few superficial facts”).

Please describe the nature of any visual clues of suspected AI usage in student submissions?

Two visual markers were highlighted repeatedly by several survey respondents.

The first sign was superfluous or inappropriate use of bullet points, or very short paragraphs tantamount to the same. It is worth noting the possibility that students may deliberately construct submissions in this way, as an attempt to circumvent detection tools. Bullet points are not currently classed as “qualifying text” by the Turnitin AI writing indicator, so these passages are not subject to the same level of automated scrutiny[1]. There may be some level of awareness amongst students about this current limitation of Turnitin.

The second indicator was references that appear only in a list after the body of the work (rather than as citations within the text), or even single, ‘catch-all’ citations at the end of paragraphs using a generic source (e.g. a textbook). One member of staff also highlighted mixed use of US/UK spellings as another visual warning (e.g. “analyse” and “analyze”, “dialogue” and “dialog”).

Have you found examples of fabricated references? How did you find out these references were fabricated?

Eleven (65%) of the survey respondents had encountered fabricated references. A variety of scenarios for the extent or nature of the fabrication were noted by multiple respondents, including:

citations that did not include full publication details (“repeated use of 'Smith (2023); Jones (2023); Brown (2023)' as in text citations but without a bibliography provided”)

valid sources that were not pertinent to the topic (“some references are cited but have no relevance to the subject material”)

entirely bogus constructs (“combinations of authors that have not published together, a plausible but non-existent title, a real journal and volume […] but page numbers that don't exist”)

Respondents conveyed clearly that each type of fabrication indicated a likelihood that the submission had been produced with use of GenAI. Two respondents noted explicitly that use of fabricated references was the most reliable evidence of GenAI misuse.

Staff reported using a combination of domain/subject knowledge for initial validation of reference integrity, followed by manual checking of any suspect or unfamiliar references through online search. One respondent noted that more recent iterations of GenAI tools have improved such that these tell-tale signs may no longer be so prevalent:

“I was interested in the results the student claimed were in the paper so I tried to access it. The journal was legitimate and the year and volume numbers corresponded, but there was no such paper on the claimed page numbers. I checked the alleged authors' websites (they were genuine academics) and saw no sign of this or any related paper, and googling the title produced no hits. Note that this occurred in 2023 whereas in 2024 I am no longer seeing fabricated references on suspect papers (I have done several checks and found only legitimate books/articles.)”

Amending teaching and learning

Have you amended your teaching content as a result of the increase in the use of AI? If so, how?

Nine (53%) of the survey respondents had amended their teaching in response to an apparent increase in the use of GenAI. Staff described a variety of measures they had taken:

highlighting institutional policy and the consequences of GenAI misuse (“[…] incorporating warnings that AI usage is treated as plagiarism at Birkbeck”)

demonstrating and explaining the flaws in content produced with GenAI (“I have shown the students examples of AI generated content […] and had them critique why the AI has done a bad job”)

shifting the learning emphasis to modes that are more robust to GenAI misuse (“I have put greater emphasis on being able to draw original diagrams to explain scientific content”)

adapting course design so that learning builds directly towards assessment (“[…] making the teaching of a topic even more tightly tied to the related assignment”)

Have you amended your assessment methods as a result of the increase in the use of AI? If so, how?

Nine (53%) of the survey respondents had amended their assessment methods in response to an apparent increase in the use of GenAI. Seven of these nine were from the set of respondents that had also amended their teaching content. Three main strategies were apparent for adapting assessments:

pre-test AI output	“I think carefully about whether questions could be answered by AI, and sometimes check ChatGPT directly to see what sort of answer it provides”
revise criteria	“… emphasize critical reflection on content rather than the content itself; I also ask students to refer specifically to course materials that are only available on Moodle …”
revise assessment	“I give 50% of the marks for text and 50% for hand-drawn annotated diagrams where I expect the handwritten annotation to explain the relevance of the diagram to the text, and the overall answer”

Have you allowed the use of AI in your assessment? If so, how?

Three (18%) of the survey respondents had permitted some use of GenAI in their assessments. All three indicated that clear parameters had been laid out on how the technology could be used (i.e. only to generate seed information, to create a plan, or in place of a search engine for research purposes) and that output from GenAI tools could categorically not be used as the basis of a submission.

Future work

Future work in this area could include running this survey regularly, to gather more responses as AI tools and students' usage of them evolve. Whilst we would have liked to look into disciplinary differences, this would have required larger numbers of respondents from across the faculties. This can be investigated further should the survey run again.

Another area this work will influence is the training and guidance offered by the Digital Education team. In the coming weeks, the team will be releasing a SharePoint Connect site, with information on ways to incorporate AI into teaching, learning and assessment, as well as ways to rework assessment across the module/programme. The team also offers training on interpreting the AI Indicator, and will be adding and amending sessions and workshops on assessments in the coming months.

Summary

From a small sample of Birkbeck teaching staff, this report provides a snapshot of the different experiences and adaptations underway due to the growing influence of GenAI over student learning activity.

The first section presents a number of common indicators that may give rise to concerns around misuse of GenAI. The second section highlights a range of different initiatives that staff have already undertaken, individually, to steer teaching and assessment methods in directions that successfully address these new challenges. 

These two sets of insights provide valuable food for thought and inspiration for other Birkbeck teaching staff, on how they may respond to the growing availability and ever-changing capability of GenAI tools.

Acknowledgments

The data evaluation and write-up were done by Rishi Shukla, and the survey and review of the evaluation were conducted by Yahya Saleh of the Digital Education team.

References

1 - Turnitin (2024). AI writing detection in the classic report view. Available at: https://guides.turnitin.com/hc/en-us/articles/28457596598925-AI-writing-detection-in-the-classic-report-view#h_01J2XZ72FDQQG4C1SET16FXZJ4 (Accessed: 10 October 2024)


Friday, 25 October 2024
