Cyber Security in AI Applications
Artificial Intelligence is the process of building intelligent machines from vast volumes of data. Such systems learn from past data and experience to perform human-like tasks, enhancing the speed, precision, and effectiveness of human effort. AI uses complex algorithms and methods to build machines that can make decisions on their own. Machine Learning and Deep Learning form the core of Artificial Intelligence. Some of the essential branches of AI include:
- Machine learning
- Deep learning
- Natural language processing
- Robotics
- Expert systems
Generative AI: Generative AI is a broad term for AI that can learn from existing artifacts to generate new, realistic artifacts (at scale) that reflect the characteristics of the training data without repeating it. It can produce a variety of novel content, such as images, video, music, speech, text, software code, and product designs. Recent breakthroughs in the field, such as GPT (Generative Pre-trained Transformer) and Midjourney, have significantly advanced the capabilities of GenAI, opening up new possibilities for solving complex problems, creating art, and even assisting in scientific research.
Security in AI!
Security in AI is a fairly new field and is still evolving. Artificial Intelligence (AI) implementations undergo rigorous security assessment per in-house security best practices, including design review, architecture review, code review, and static & dynamic assessment. Any findings are logged in Jira and patched by the respective stakeholders. Licensing keys and subscription keys are containerised, ensuring adherence to a "no keys on the wire" approach, as the sketch below shows.
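A minimal sketch of what that can look like in practice, assuming the openai Python SDK and keys injected into the container as environment variables (the variable names here are illustrative, not mandated):

```python
import os

from openai import AzureOpenAI  # pip install openai

# Keys come from the container's environment (e.g. injected from a secret
# store at deploy time) rather than being hard-coded or passed in URLs.
client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],          # illustrative name
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # illustrative name
    api_version="2024-02-01",
)
```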
Security Review stages:
- Design review
- Architecture review
- Code review
- Static & dynamic assessment
- Findings logged in Jira and patched by the respective stakeholders
Azure OpenAI processes the following types of data (a usage sketch follows this list):
- Prompts and generated content. Prompts are submitted by the user, and content is generated by the service, via the completions, chat completions, images, and embeddings operations.
- Augmented data included with prompts. When using the "on your data" feature, the service retrieves relevant data from a configured data store and augments the prompt to produce generations that are grounded in your data.
- Training & validation data. You can provide your own training data, consisting of prompt-completion pairs, for the purpose of fine-tuning an OpenAI model.
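As a rough illustration of those operations, here is a sketch reusing the AzureOpenAI client from the earlier example; the deployment names are placeholders, not real resources:

```python
# Chat completion: the prompt is submitted by the user, the content is
# generated by the service.
chat = client.chat.completions.create(
    model="my-gpt4-deployment",  # placeholder Azure deployment name
    messages=[{"role": "user", "content": "Explain our data retention policy."}],
)
print(chat.choices[0].message.content)

# Embedding: the same prompt data flows through the embeddings operation.
emb = client.embeddings.create(
    model="my-embedding-deployment",  # placeholder Azure deployment name
    input="Prompts and generated content",
)
print(len(emb.data[0].embedding))  # dimensionality of the returned vector
```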
AI Privacy:
OpenAI & Azure AI comply with GDPR and CCPA, and can execute a Data Processing Agreement if required.
AI Compliance:
OpenAI & Azure AI are compliant with HIPAA, SOC 2, and SOC 3.
Newest ISO Framework for AI: ISO/IEC 23894:2023 — February 2023
Newest NIST Guidelines for AI: Artificial Intelligence Risk Management Framework (AI RMF 1.0) — January 2023
Vulnerability Assessment & Penetration Test:
OpenAI/Azure AI: The OpenAI/Azure AI APIs undergo annual third-party penetration testing, which identifies security weaknesses before they can be exploited by malicious actors. Pentest reports and executive summaries can be requested from the OpenAI or Azure customer portal.
FAQ:
1. Does this mean OpenAI stores or processes your data from ChatGPT?
No, not if you opt out: you can opt out of data collection and model improvement by filling out this form. This opt-out is a relatively new option; by default, OpenAI stores and processes information sent to ChatGPT to improve its models. Either way, OpenAI does not recommend ChatGPT for sensitive data (source).
2. Do OpenAI GPT-3/4 APIs use your data for model improvement?
No. OpenAI does not use data submitted via its API to train OpenAI models or improve its offerings. However, it is important to keep in mind that data sent to the APIs is processed on servers hosted in the US, and OpenAI does store data sent via the API for abuse-monitoring purposes for up to 30 days. OpenAI allows you to opt out of this monitoring so that your data is not stored at all; you can opt out using this form. This means your data lifecycle starts and ends with each API call: data is sent via the API, and the output is returned as the response. The API does not remember or store anything between requests, as the sketch below illustrates.
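A minimal sketch of that stateless lifecycle, assuming the openai Python SDK with OPENAI_API_KEY set in the environment:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Request 1: data goes in, a completion comes back.
first = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "My name is Alice."}],
)

# Request 2: an entirely independent call. The API keeps no memory of the
# previous request, so any context must be resent by the caller.
second = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is my name?"}],
)
print(second.choices[0].message.content)  # the model cannot know "Alice"
```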
3. Does Azure OpenAI have the same policy?
Yes. Azure OpenAI Service does not use data submitted via its API to train models or improve its offerings. Like OpenAI, it stores the data you send via the API for abuse-monitoring purposes for up to 30 days. However, Microsoft allows you to opt out of this monitoring so that your data is not stored for it. You can opt out using this form. On top of this, Azure already provides network security through features such as private networks and private endpoints.
4. What about fine-tuning?
In both OpenAI and Azure OpenAI, your fine-tuned model is your own: no one outside your organisation has access to the files used to train the model or to the trained model itself. The files used for fine-tuning can be deleted after training, leaving you with just a model that generates an output (completion) from your prompt, with neither prompt nor completion being stored (see the sketch below).
On Azure, you also get additional flexibility over where your data resides by choosing an appropriate region. However, not all regions are available for fine-tuning at the moment, especially outside the US. More details on this can be found here.
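As a sketch of that workflow with the openai Python SDK (the file name and base model below are assumptions for illustration):

```python
from openai import OpenAI

client = OpenAI()

# Upload prompt-completion training data (JSONL) and start a fine-tune.
training_file = client.files.create(
    file=open("train.jsonl", "rb"),  # hypothetical local training file
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # an assumed fine-tunable base model
)

# Once the job has succeeded, the training file can be deleted, leaving
# only the fine-tuned model, which is private to your organisation.
client.files.delete(training_file.id)
```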
Reference & Further Studies: https://rrohitrockss.medium.com/risk-in-ai-security-c69852453dcd