Digital Economy Dispatch #188 -- AI in Software Engineering: The Battle for the Soul of Software

16th June 2024

Over the past 30 years, I have written, read, and reviewed a lot of software. From accounting and stock control systems to real-time process control modules, I have had the chance to work with code in many forms and in several programming languages. Much of it involved basic logic for manipulating data, processing transactions, or interacting with users. However, some pieces were complex, subtle, highly specialized, or deeply integrated within a mass of existing code that had been poked and patched a dozen times before I got there.

Looking at the software didn’t just allow me to say what it did. It also allowed me to say where it came from. Ask anyone who has spent a long time staring at code and they will tell you that every piece of software tells a story: whether it was developed by individuals or by teams, the skills and experience of the people who created it, whether the code was output automatically by a code generator, reused, or rewritten, and whether it has seen many changes and upgrades. You could say that every software system has a soul.

With a new wave of AI-powered software engineering tools appearing, it will be interesting to see their effects on the software they create. More software can be generated more quickly. Large software systems can be reviewed, documented, and standardized with ease. In fact, the prospect is that much of the next generation of software could be created by AI tools, an idea that excites and scares me in equal measure.

What will be the role of software engineers in the future? Is AI going to take over the task of software development? Will AI change the way software is designed, delivered, and deployed? And if so, should we be pleased or start running for the hills?

When Software Meets AI

While AI has emerged as a transformative force across various sectors, it appears that software engineering may be one of the areas where AI has the deepest impact. As the demands for faster development cycles, higher code quality, and seamless integration of complex systems grow, AI offers a suite of tools and techniques that can significantly enhance the software development process.

Indeed, Gartner is bold enough to suggest that by 2027, 70% of professional developers will use AI-powered coding tools, up from less than 10% today. This use of AI in software engineering will focus on several areas: generative AI for code creation, AI-driven testing, code translation from legacy languages, documentation generation, and the enforcement of best practices within development teams.

Generative AI for Code Generation

Investigations into the use of AI in software engineering have been ongoing for some time, but it is generative AI, particularly models like OpenAI's Codex and tools like GitHub Copilot, that has revolutionized code generation. These advanced models can interpret natural language descriptions and translate them into functional code snippets, effectively acting as sophisticated auto-completion tools. For instance, a developer can describe a function they need, such as "a Python function to sort a list of dictionaries by a specific key," and the AI can generate the corresponding code. This capability significantly reduces the time spent on boilerplate code and accelerates the prototyping phase.
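To make that prompt concrete, here is a minimal sketch of the kind of code such a request might produce. It is not the output of any particular tool; the function name `sort_dicts_by_key` and the sample data are hypothetical.

```python
from operator import itemgetter
from typing import Any


def sort_dicts_by_key(
    records: list[dict[str, Any]], key: str, reverse: bool = False
) -> list[dict[str, Any]]:
    """Return a new list of dictionaries sorted by the value stored under `key`."""
    # itemgetter raises KeyError if a record lacks the key,
    # so malformed data fails loudly instead of sorting silently.
    return sorted(records, key=itemgetter(key), reverse=reverse)


# Hypothetical sample data for illustration only.
people = [
    {"name": "Ada", "age": 36},
    {"name": "Linus", "age": 28},
    {"name": "Grace", "age": 45},
]
by_age = sort_dicts_by_key(people, "age")
print([p["name"] for p in by_age])  # ['Linus', 'Ada', 'Grace']
```

Even in a snippet this small there are decisions a reviewer should check, such as what happens when a record is missing the key or when values of different types are mixed.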

However, it is important not to get carried away. Examples like this form only a small part of the job of a software engineer. The generated code often requires human review to ensure correctness, security, and efficiency. Furthermore, developers must understand the underlying logic and not treat AI-generated code as infallible.

AI in Testing Complex Code

Testing is a critical aspect of software engineering, often consuming significant resources. AI enhances this phase by automating test case generation and execution, and by identifying potential bugs through predictive analytics. Machine learning models can analyze codebases to predict areas prone to errors, allowing developers to focus their testing efforts more effectively.

For example, AI-powered tools like DeepCode and TestGrid use machine learning to detect potential bugs and vulnerabilities in code by learning from vast repositories of open-source projects. These tools can identify patterns that might lead to errors, such as common pitfalls in specific programming languages or frameworks. By automating the identification of these issues, AI reduces the manual effort required in the testing phase and increases the overall reliability of the software.

Moreover, AI can optimize regression testing by selecting the most relevant test cases to run when code changes are made. This selective testing ensures that modifications do not introduce new bugs, maintaining software stability without the need for exhaustive testing of the entire codebase.
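The idea of selective regression testing can be sketched without any machine learning at all. The example below is a deliberately simplified, rule-based stand-in: it assumes a coverage map (which source files each test exercises) is already available and picks the tests that touch the changed files. An AI-based selector would typically learn this relationship from historical data instead. All names and file paths here are hypothetical.

```python
def select_tests(coverage_map: dict[str, set[str]], changed_files: set[str]) -> list[str]:
    """Return the tests whose covered files overlap the changed files, sorted by name."""
    return sorted(
        test
        for test, covered in coverage_map.items()
        if covered & changed_files  # non-empty intersection means the test is relevant
    )


# Hypothetical coverage data: test name -> source files it exercises.
coverage_map = {
    "test_checkout": {"cart.py", "payment.py"},
    "test_login": {"auth.py"},
    "test_invoice": {"payment.py", "invoice.py"},
}

selected = select_tests(coverage_map, {"payment.py"})
print(selected)  # ['test_checkout', 'test_invoice']
```

A change to `payment.py` triggers only two of the three tests here. The saving is trivial in this toy case, but the same principle is what makes selection worthwhile on a codebase with thousands of tests.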

Code Translation and Rewriting for Legacy Systems

Many organizations still rely on legacy systems written in outdated programming languages. Maintaining and updating these systems is challenging due to a shrinking pool of developers proficient in these languages and the inherent limitations of the outdated code. AI offers a solution by automating the translation and rewriting of legacy code into modern programming languages.

For instance, IBM's AI-powered tools can assist in converting COBOL code, which is still prevalent in many back office systems, into more modern languages like Java or C#. This process involves understanding the old code, mapping its functionality to the new language, and ensuring that the new code performs identically to the original. The AI models used in this process are trained on vast amounts of legacy and modern code, enabling them to handle the intricacies of different programming paradigms.

Automating this translation extends the life of legacy systems and makes them more maintainable and scalable, ensuring that organizations can continue to leverage their existing investments while transitioning to modern technologies.
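The requirement that translated code performs identically to the original is often checked by running both versions on the same inputs and comparing the results. The sketch below illustrates that idea in Python. The two interest routines are invented stand-ins for a legacy routine and its translation, not real COBOL or Java code, and `find_mismatches` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Callable, Iterable


def legacy_interest(balance_cents: int, rate_bp: int) -> int:
    """Stand-in for the original routine: integer arithmetic, rounding half up."""
    return (balance_cents * rate_bp + 5000) // 10000


def migrated_interest(balance_cents: int, rate_bp: int) -> int:
    """Stand-in for the translated routine, rewritten with Decimal."""
    amount = Decimal(balance_cents) * Decimal(rate_bp) / Decimal(10000)
    return int(amount.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def find_mismatches(
    old: Callable[..., int], new: Callable[..., int], cases: Iterable[tuple]
) -> list[tuple]:
    """Run both implementations on each case and collect any differing results."""
    mismatches = []
    for args in cases:
        expected, actual = old(*args), new(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches


# Typical inputs agree...
cases = [(b, r) for b in (0, 1, 199, 10_000, 123_456) for r in (0, 125, 5000)]
print(find_mismatches(legacy_interest, migrated_interest, cases))  # []

# ...but a negative balance exposes a rounding difference on ties.
print(find_mismatches(legacy_interest, migrated_interest, [(-1, 5000)]))
# [((-1, 5000), 0, -1)]
```

The second call shows why this kind of check matters: the two versions agree on ordinary inputs yet round a negative half differently, exactly the sort of subtle behavior buried in old code that a translation can silently change.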

Generating Documentation for Legacy Code

One of the most time-consuming aspects of working with legacy systems is understanding the existing codebase, which often lacks adequate documentation. AI can significantly mitigate this issue by automatically generating documentation for legacy code. Tools like docify can analyze code and produce human-readable documentation that explains the purpose and functionality of different components.

Additionally, AI can keep documentation up-to-date by continuously analyzing code changes and updating the corresponding documentation sections. This dynamic approach ensures that documentation remains accurate and reflective of the current state of the code, which is essential for maintaining high-quality software.
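Before any tool can describe code, it has to work out what is there. As a rough, non-AI illustration of that first step, the sketch below uses Python's standard `ast` module to list the functions in a piece of source code and flag those with no docstring. It does not represent how docify or any other product works; the `outline` helper and the sample source are hypothetical.

```python
import ast


def outline(source: str) -> list[str]:
    """List each function in `source` with its arguments and docstring summary."""
    functions = [
        node
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    lines = []
    # ast.walk does not guarantee an order, so sort by position in the file.
    for node in sorted(functions, key=lambda n: n.lineno):
        args = ", ".join(a.arg for a in node.args.args)
        doc = ast.get_docstring(node)
        summary = doc.splitlines()[0] if doc else "UNDOCUMENTED"
        lines.append(f"{node.name}({args}): {summary}")
    return lines


# Hypothetical legacy module used as input.
sample = '''
def post_payment(account, amount):
    """Apply a payment to an account balance."""
    return account - amount

def recalc(x, y):
    return x * y
'''

print("\n".join(outline(sample)))
# post_payment(account, amount): Apply a payment to an account balance.
# recalc(x, y): UNDOCUMENTED
```

Rerunning a check like this whenever the code changes is one simple way to spot where documentation has fallen behind, which is the gap that AI-generated descriptions aim to fill.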

Applying Best Software Engineering Practices

AI is also instrumental in enforcing best software engineering practices and code consistency within development teams. Tools like SonarQube and Amazon CodeGuru use static analysis, increasingly combined with machine learning, to analyze code quality and adherence to best practices. These tools can provide real-time feedback to developers, highlighting potential issues related to code style, security vulnerabilities, performance bottlenecks, and more.

For instance, an AI-powered code review tool can analyze a pull request and suggest improvements based on established best practices. It can flag potential security issues, such as SQL injection vulnerabilities, recommend performance optimizations, and ensure that the code adheres to the team's coding standards. By integrating these tools into the development workflow, teams can maintain high standards of code quality and security.
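To show what a flagged SQL injection issue looks like in practice, here is a small self-contained sketch using Python's built-in `sqlite3` module. The table, the function names, and the payload are hypothetical. The first function builds the query by string formatting, the pattern a review tool would be expected to flag, and the second uses a bound parameter, the fix it would typically suggest.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)


def find_user_unsafe(name: str) -> list[tuple]:
    # Vulnerable: user input is spliced directly into the SQL text.
    query = f"SELECT name, role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list[tuple]:
    # Safer: the value is bound as a parameter and never parsed as SQL.
    query = "SELECT name, role FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


payload = "x' OR '1'='1"
print(len(find_user_unsafe(payload)))  # 2: the injected condition matches every row
print(find_user_safe(payload))         # []: no user has that literal name
```

Both functions behave identically for ordinary names, which is why this class of bug slips through manual review so easily and why automated checks on every pull request are useful.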

Furthermore, AI can facilitate better project management by analyzing historical project data to predict potential risks and bottlenecks. Predictive analytics can help project managers make informed decisions about resource allocation, timeline adjustments, and risk mitigation strategies, ultimately leading to more efficient and successful project outcomes.

An Example: Google’s Experiences with AI for Software Engineering

These descriptions of AI use are helpful. But what really matters is how they come together in an organization that is creating large-scale software-intensive systems. We recently gained valuable insight into this from Google, whose engineers described how they have evolved their use of AI in software engineering activities.

They relate that in 2019, the use of AI in software engineering was largely theoretical for most developers. Fast forward to 2024, and there is now significant enthusiasm among Google’s software engineers for AI's role in coding. Many engineers now regularly use ML-based autocomplete tools that enhance coding efficiency. This shift reflects a broader transformation in internal software development tools at Google, driven by AI.

Google's AI-powered tools have notably improved various stages of the software development process. They span Integrated Development Environments (IDEs), code review, and bug management systems, and aim to boost productivity and developer satisfaction. The company's approach involves prioritizing ideas that are both technically feasible and likely to have a significant impact, learning quickly from iterative development, and measuring the effectiveness of these tools. With the introduction of transformer architectures, Google's focus on applying large language models (LLMs) to software development has led to the widespread adoption of coding capabilities such as inline code completion.

With this approach, Google reports that AI-driven code completion fits naturally into developer workflows, leveraging extensive historical data to enhance model performance. Acceptance rates for AI suggestions are high, with significant portions of new code characters now generated by AI. Other AI applications include resolving code review comments, adapting pasted code, and predicting fixes to build failures. These tools have become integral to Google's software development process, offering substantial productivity benefits by reducing the time developers spend on routine coding tasks.
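Measures such as acceptance rate and the share of code characters produced by AI are simple ratios. The sketch below shows one plausible way to compute them. The definitions are assumptions made for illustration (accepted suggestions divided by suggestions shown, and accepted characters divided by accepted plus manually typed characters), and the numbers are invented rather than taken from Google's reports.

```python
from dataclasses import dataclass


@dataclass
class CompletionLog:
    """Hypothetical counters collected from an editor over some period."""
    suggestions_shown: int
    suggestions_accepted: int
    accepted_chars: int  # characters inserted via accepted suggestions
    typed_chars: int     # characters typed manually


def acceptance_rate(log: CompletionLog) -> float:
    """Assumed definition: accepted suggestions / suggestions shown."""
    if log.suggestions_shown == 0:
        return 0.0
    return log.suggestions_accepted / log.suggestions_shown


def ai_char_fraction(log: CompletionLog) -> float:
    """Assumed definition: accepted characters / all new characters."""
    total = log.accepted_chars + log.typed_chars
    return log.accepted_chars / total if total else 0.0


# Invented numbers for illustration only.
log = CompletionLog(
    suggestions_shown=200,
    suggestions_accepted=50,
    accepted_chars=1200,
    typed_chars=2800,
)
print(acceptance_rate(log))   # 0.25
print(ai_char_fraction(log))  # 0.3
```

Keeping the two measures separate matters: a tool can have a modest acceptance rate and still contribute a large share of characters if the suggestions that are accepted tend to be long.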

Looking ahead, Google aims to integrate the latest foundation models, such as the Gemini series, into its development tools. Future improvements are expected in areas like code testing, understanding, and maintenance. The industry is also shifting towards using natural language as an interface for software tasks, and ML-based automation for larger tasks is gaining traction. To advance these capabilities, the software engineering community is encouraged to develop common benchmarks for a broader range of tasks beyond code generation, including debugging and code migrations, to support ongoing innovation and practical applications in the field.

Toward a World of AI-driven Software Engineering?

The integration of AI into software engineering practices is reshaping the industry, offering powerful tools that enhance productivity, quality, and maintainability. From generative AI for code creation to automated testing, legacy code translation, documentation generation, and the enforcement of coding standards, AI is becoming an indispensable ally for software developers.

As AI technologies continue to evolve, their application in software engineering will undoubtedly expand, driving further innovation and efficiency. Google’s software engineers, as we have seen, are expanding their use of AI beyond basic coding tasks toward many other areas of the software lifecycle. As they do so, there is much we will learn about where and how to apply AI effectively.

However, the increasing use of AI in software engineering is also causing concern. Where can it be applied most effectively to add value? Where should it be avoided? How can we ensure the human aspects of the craft of software engineering are not lost? Let’s hope that we can use AI responsibly without software losing its soul.