The Dark Side of Generative AI: Part 2 — Mitigating the Security and Privacy Risks

Posted 8 Jan 2024 (15 minute read)

To secure your project, you need to assess and mitigate every risk you can identify. Here is how to approach that task.

Note: This is part 2 of a 3-part series exploring the pitfalls that developers of Generative AI-based applications need to know about and try to mitigate.

Part 1 — Understanding the Security and Privacy Risks
Part 2 — Uncovering pitfalls and mitigations that affect the Security and Privacy Risks
Part 3 — Navigating the Murky Waters of Generative AI Regulation Landscape (Coming)

Introduction

Have you ever wondered how risk assessment for generative AI applications differs from the risk assessment performed for other applications?

Read more

How Innovation Benefits My Business: Is It a Need, a Privilege, or a Catalyst?

Posted 12 Dec 2023 (9 minute read)

In this blog post, we will explore the role of innovation in business success and ways to implement it.

Introduction

You cannot achieve long-term sustainable success by relying solely on your current strategies and technologies. The world is dynamic and rapidly changing, and you need to change and adapt with it. Innovation is imperative to ensure long-term sustainability, enhance competitiveness, and drive business growth. In this post, we aim to help you understand the importance of innovation in achieving business success.

Read more

Large Language Models are Few-shot Testers

Posted 12 Dec 2023 (5 minute read)

In this post we’ll review the paper Large Language Models are Few-shot Testers: Exploring LLM-based General Bug Reproduction.

Background

Writing test cases for bugs is a critical yet tedious part of software development. How much developer time is spent on this? The authors analyze 300 open source Java projects and find that on average 28% of tests were added as part of a bug fix. Most existing automated test generation tools focus on maximizing code coverage rather than reproducing issues described in natural language [1].

Read more

Beyond Copilot: How AI Will Revolutionize the Software Development Lifecycle

Posted 23 Nov 2023 (6 minute read)

Within a year, coding assistants like GitHub Copilot went from sci-fi fantasy to essential tools for developer productivity. Sure, they’re not always helpful, and sometimes they’re downright misleading, but they’re getting better every day, and traffic to Stack Overflow has been down by an average of 6% every month since January 2022 (make of that what you will). So how is AI being applied to other aspects of the software development lifecycle?

Read more

The Dark Side of Generative AI: Part 1 — Understanding the Security and Privacy Risks

Posted 20 Nov 2023 (4 minute read)

An overview of the main security and privacy issues developers need to consider when building products that leverage Generative AI.

Note: This is part 1 of a 3-part series that will focus on the main security and privacy aspects of building applications that leverage Generative AI technology.

The Rise in Popularity of Generative AI Models

Generative Artificial Intelligence (AI) has rapidly gained popularity and found its way into various applications, empowering products with its ability to generate realistic content such as images, videos, text, and even music.

Read more

Assessing the Competitiveness of Web3 Sectors: Insights into Capital Concentration

Posted 25 Jun 2023 (4 minute read)

In this post we look at the distribution of capital within different sectors of web3 protocols, e.g. DEXes, bridges, lending, and derivatives. To gauge the competitiveness of these sectors, we turn to a trusted tool from the world of economics: the Herfindahl-Hirschman Index (HHI).

The Herfindahl-Hirschman Index: A Measure of Market Concentration

HHI is a measure of the level of competition among firms in a market, or in our case, protocols within a sector.
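
As a rough illustration of the idea, the standard HHI is the sum of the squared market shares of every participant; with shares expressed in percent it ranges from near 0 (many small players) to 10,000 (a single player holding everything). The sketch below is a minimal example under stated assumptions: the function name, the use of "capital per protocol" as the share basis, and the sample figures are all hypothetical and are not taken from the post.

```python
from typing import Mapping


def herfindahl_hirschman_index(capital_by_protocol: Mapping[str, float]) -> float:
    """Return the HHI of a sector on the conventional 0 to 10,000 scale.

    Assumption: each protocol's market share is its fraction of the
    sector's total capital. The post may define shares differently.
    """
    total = sum(capital_by_protocol.values())
    if total <= 0:
        raise ValueError("total capital must be positive")
    # Express each share in percent, square it, and sum the squares.
    return sum((100.0 * value / total) ** 2 for value in capital_by_protocol.values())


# Hypothetical figures for illustration only; not data from the post.
example_sector = {"protocol_a": 50.0, "protocol_b": 30.0, "protocol_c": 20.0}
hhi = herfindahl_hirschman_index(example_sector)
print(round(hhi))  # 50^2 + 30^2 + 20^2 = 3800
```

A higher value means capital is concentrated in fewer protocols; four equally sized protocols would score 2,500, while a sector with one protocol would score 10,000.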

Read more

The Times covers our AI work with Cabinet Office

Posted 25 Nov 2020 (1 minute read)

Atchai has been working with Cabinet Office in the UK to explore how machine learning and natural language processing technology can be applied to improve organisational knowledge transfer and ultimately to improve policy making. Our work received positive coverage in The Times with a quote from our CTO:

Read more

Reasons for unrest: Using transformers to understand the reasons behind global protests

Posted 18 Nov 2020 (5 minute read)

We have developed a technique to understand the dominant reasons for protests and demonstrations happening around the world. Our work builds on the ACLED dataset, a source that’s regularly quoted in the media and described by the Guardian as being “the most comprehensive database of conflict incidents around the world”. ACLED helpfully collects all political violence and protest events, but it only provides a text description of each event, which makes it hard to cluster together events that have a common theme and ask questions such as “how many BLM protests are happening per week?

Read more

AI will be the judge of that - predictive algorithms in the public sector

Posted 9 Sep 2020 (5 minute read)

Algorithms have been choosing our fate for years. From automated mortgage decisions to dating apps assessing our compatibility with a potential mate, machines are running the show. In spite of this, the controversy over UK exam grades being predicted by an algorithm has led many to question our reliance on computers over human judgement. Many feel troubled by the concept of algorithms assessing whether a criminal is likely to reoffend. However, by helping tackle the deep-seated problem of unconscious bias in the judicial system, could they be working towards the greater good?

Read more

What can data scientists contribute to the COVID-19 effort?

Posted 23 Mar 2020 (4 minute read)

Health care professionals are the real heroes of the moment but there is an important role for data scientists to play in the fight against the pandemic that’s shaking the globe right now. Your first port of call should be the COVID-19 Open Research Dataset — released on 20th March by the White House and a coalition of leading research groups. I recommend heading over to the associated Kaggle competition where you can collaborate and build upon the work of other data scientists who have already made progress on cleaning and understanding the potential of this dataset containing over 29,000 academic papers related to COVID-19, SARS-CoV-2, and other coronaviruses.

Read more