AI Ethics and Research

Image generated by Google Deepmind. Aside from this image, none of the text or content of this article has been AI-generated or refined.

In a little over five minutes, Adam Aleksic’s TED Talk about social media, language trends, and AI software completely changed my relationship with writing. It’s no great secret that the tools Aleksic is talking about (Sora, ChatGPT, Claude, Gemini, and others) have dramatically changed the way we approach everyday tasks, from writing simple sentences to gathering information online. But the rise and fall of linguistic trends and the ways we train programs like ChatGPT and Claude invite a lot more questions than answers. Aleksic is incisive in his critique of AI and the way these tools, often trained using social media posts and other freely available online media, falsely inflate consumer trends and language behaviors. He points to popular fads driving consumer spending and new colloquial words that have entered our vocabulary almost without our noticing. I think Aleksic’s talk offers a lot to the everyday social media and ChatGPT user, but it also means that the pervasive use of AI, from image generation to text editing, must be examined and questioned deeply. I contend that there is no better place, and perhaps no more important realm, in which to examine the concept of AI ethics than research. From the mundane to the complex, from text generation to data analysis, AI is finding a foothold in academic research. Yet our ability to determine where and how to use it seems limited, especially where students and our academic work at the University are concerned.

The provost’s statement on AI regarding student work is concise but sweeping. It broadly prohibits the use of AI on assignments or exams, yet leaves the concrete question of AI permissibility up to individual instructors. That aspect of the policy diffuses responsibility for classroom AI use across the university and may understandably put pressure on instructors to come up with their own principles of AI (mis)use. Later parts of the statement address wider questions of AI and research ethics by establishing some hardline expectations. The statement pushes for transparency, offers guidance about confidential information, and considers how copyrighted information might be used to train AI models. My issue with Columbia’s AI policy is not simply that the question of the student-researcher is left mostly ambiguous, but that the policy itself grapples more with responsible use than with responsible sourcing. It provides just enough scaffolding to leave open the question of AI in the classroom and to establish standards around AI in research, but fails to go much deeper than that.

To be clear, when I talk about the limitations of Columbia’s AI policy, I’m not advocating for the complete elimination of AI from research or academic processes. Speaking from personal experience, I find AI useful in generating Excel scripts for complex data analysis, matching datasets together, and even refining sentences that sound clunky or incoherent at first blush. But these tools are far more complex than the information that we give them. For instance, I had a conversation with my Senior Thesis seminar about Otter.ai, a tool that uses AI to help transform voice recordings into interactive transcripts that can even distinguish between different speakers. The question was whether the audio that we give to the platform can be used to train the AI software that Otter.ai uses for transcription. Carefully reviewing the site policies, we found out that the answer was yes; what ensued was an important conversation about research and IRB ethics that could have benefited substantially from more thoughtful university discourse on AI. These conversations are not new. In classrooms, during my internship last summer, and even over break, I find myself talking with different people about AI, the way we use it, and the way the software is trained. And I think the key idea missing from Columbia’s AI policy, one that is more essential now than ever before, is the way we consider how and where these tools are trained.

Aleksic’s talk already demonstrated how the term “delve” began appearing at far higher rates in ChatGPT responses than in everyday conversation, likely resulting from the labor and training material given to the model. Consequently, the term began appearing increasingly often in casual parlance. The systems, sources, and people who help train models like Claude and ChatGPT introduce linguistic patterns that are later reproduced in the responses we get. More concerningly, if the material used to refine these models displays ideological biases, the worry isn’t just whether those biases are reproduced later, but whether they have the power to influence the public in the same way that “delve” did. AI training and refinement have undeniable downstream consequences, and it’s time to start thinking about how the world of research will adapt. Even the non-generative systems we use have consequences. The large language models (LLMs) that we use are only as good as what we give to them; from API-generated embeddings to retrieval-augmented generation (RAG), analyzing language is almost never done in a vacuum and resists any measure of objectivity we may award to an LLM’s computerized workflow.

It’s time to start chipping away at the AI monolith, and to understand that each tool is fundamentally different, beyond the marketed capabilities we associate with each model. Claude might be great at coding, and ChatGPT might excel at writing, but the more important concern should be the larger world we step into as soon as we engage these models in conversation. Like it or not, when we do so, we confront an unimaginably vast trove of digital information that may have nothing to do with our tasks at hand. These externalities have already found a way to seep into daily life, and if we let that process go unchecked, we may find that the fundamental process of thinking like a human (researching, writing, even speaking) starts to disappear.
