
PIPC Releases Guideline on Processing of Personal Information in Developing and Using Generative AI

2025.08.08

On August 6, 2025, the Personal Information Protection Commission (the “PIPC”) released the Guideline on Processing of Personal Information in Developing and Using Generative AI (the “Guideline”) (available in Korean, Link).

The Guideline divides the lifecycle of developing and using generative AI into four stages: (i) purpose setting, (ii) strategy setting, (iii) AI training and development, and (iv) system application and management. It provides considerations for service providers developing and using generative AI to protect personal information at each stage. The Guideline also presents examples of investigations and administrative actions related to AI-based personal information processing, as well as examples of innovation support systems, to assist service providers in better understanding these issues.

The PIPC describes the Guideline as a compilation of experiences accumulated through preliminary inspections, regulatory sandboxes, and preliminary adequacy reviews, as well as insights from existing publications, such as the Guideline on Processing Publicly Available Data for AI Development and Services and the AI Privacy Risk Management Model.

Below is a summary of the four stages of developing and using generative AI, including the tasks and considerations for each stage as presented in the Guideline:
 

Stage / Tasks and Considerations

Purpose Setting

  • Define specific purposes for generative AI, considering the context of use, intended targets, and technical limitations

  • Considerations:
    - Identify a lawful basis to process personal information in accordance with the purpose

Strategy Setting

  • Establish key strategies, including methods for developing and using AI and risk management plans, by assessing the service provider’s information and communications technology capabilities and circumstances

  • Considerations:
    - Conduct privacy impact assessments to implement Privacy by Design

AI Training and Development

  • Provide (further) training to the AI with data and fine-tune the model to achieve the intended purpose

  • Considerations:
    - Verify data sources, preprocess the data, and apply pseudonymization or anonymization to prevent data contamination
    - Enhance the AI model with additional safety controls, including fine-tuning and alignment
    - Control access to the AI system and apply input/output filtering

System Application and Management

  • Apply the generative AI system to the service environment, maintain system performance and stability, and prevent infringement of data subjects’ rights

  • Considerations:
    - Review and document privacy risks through preliminary testing
    - Draft and publish an Acceptable Use Policy
    - Develop a function for reporting personal information infringements and establish measures to ensure data subjects’ rights

 

Key Considerations in Each Stage of Developing and Using Generative AI
 

1. Stage 1: Setting the Purpose of Using Generative AI and Identifying a Lawful Basis for Processing Personal Information

The Guideline states that the lawful basis for processing personal information should be identified for each source, largely classifying the sources as (i) collection of publicly available data and (ii) reuse of data subjects’ personal information.

 

(1) Collecting and using publicly available data: The legitimate interest clause under Article 15, Paragraph (1), Item 6 of the Personal Information Protection Act (the “PIPA”) may serve as a lawful basis for the collection and use of personal information. To this end, it is essential to minimize the risk of infringing data subjects’ rights by establishing the legitimacy of the purpose, demonstrating the necessity of processing publicly available data, and implementing technical and managerial safeguards, as well as measures to ensure data subjects’ rights.[1]
 

(2) Training or developing AI by reusing data subjects’ personal data: As detailed in the table below, the lawful basis for processing such data varies depending on the relevance of the AI training and development to the original purpose of personal data collection, as well as the nature of the personal data.
 

Case / Details

Where data processing is aimed at improving or enhancing services within the scope of the original purpose of collection

  • Data processing should be based on consent, contractual necessity, legitimate interest, or another lawful basis for collection.

Where data processing is reasonably relevant to the original purpose of collection

  • The additional use clause (Article 15, Paragraph (3) of the PIPA) may serve as a lawful basis.

  • Reasonable relevance, predictability, the risk of unfair infringement of data subjects’ interests, and safeguards, such as pseudonymization and encryption, should be comprehensively considered.

  • If additional use continues, the criteria for such use should be disclosed in the privacy policy, and the Chief Privacy Officer (“CPO”) should review compliance with such criteria.

Where data is used to develop new services apart from the original purpose of collection

  • Personal information should be pseudonymized or anonymized, or a new lawful basis is required.

  • A regulatory sandbox may serve as a basis if the use is innovative or serves the public interest, provided that certain conditions are met.

Where processing involves sensitive information or unique identification information

  • Separate consent or a legal basis is required.

  • Personal visual data should be processed only within the scope of the purposes for installing and operating the filming device and for filming.

 

2. Stage 2: Establishing Strategies for Generative AI Development, Use, and Risk Management

The Guideline classifies generative AI into three types depending on how it is developed and used: (i) LLM-as-a-service, which integrates commercial AI services, such as proprietary models, via APIs; (ii) ready-made LLM, which involves further training and development of open-weight pre-trained models; and (iii) self-development, which involves pre-training models from scratch.

 

Classification / Issues and Considerations

LLM-as-a-Service

  • Issue: As user data is transmitted and received during use of the service, it is essential to ensure the security of user data processing.

  • Use enterprise API licenses with robust data protection

  • Include data protection clauses in contracts, such as enterprise terms of use and Data Processing Addenda

  • Confirm any overseas transfer of personal information and verify the relevant notices and consents

Ready-Made LLM

  • Issue: Caution is warranted, as the sources of personal data used during initial training and the lawful basis for processing them are unclear.

  • Use models trained on datasets with clearly identified sources and histories

  • Review the safeguards applied to the model and implement additional safeguards

  • Apply the latest updates and patches on a regular basis

Self-Development

  • Issue: The service provider is fully responsible for the entire process of AI development and use.

  • Identify privacy risk factors at all stages, including pre-training, fine-tuning, distribution, operation, and follow-up management, and establish risk mitigation measures

 

3. Stage 3: Safeguards at the Stage of Training and Development of Generative AI

The Guideline recommends privacy safeguards that can be considered at the data, model, and system levels in the AI training and development process, including: (i) at the data level, addressing data contamination, bias, and inaccuracy, performing data preprocessing, such as pseudonymization and anonymization, and implementing privacy-enhancing technologies; (ii) at the model level, fine-tuning and protecting against adversarial attacks intended to extract personal data; (iii) at the system level, implementing API access controls and filtering; and (iv) continuously assessing these safeguards and making improvements based on such assessments.
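As a purely illustrative sketch (not part of the Guideline), a data-level safeguard such as pseudonymization during preprocessing could look like the following in Python. The salt value, the e-mail pattern, and the token format are assumptions made for this example only.

```python
import hashlib
import re

# Hypothetical sketch of a data-level safeguard: replace e-mail addresses in
# training text with stable pseudonym tokens derived from a salted hash, so
# records remain linkable without exposing the original identifier.

SALT = "rotate-per-dataset"  # assumption: the salt is managed and rotated separately

def pseudonymize(value: str) -> str:
    """Map a value to a stable, non-reversible pseudonym token."""
    digest = hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()
    return "PSEUDO_" + digest[:12]

# Simplified e-mail pattern for illustration; production patterns differ.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def preprocess(text: str) -> str:
    """Substitute every detected e-mail address with its pseudonym token."""
    return EMAIL_RE.sub(lambda m: pseudonymize(m.group()), text)
```

The same approach could extend to other identifiers with additional patterns; for anonymization rather than pseudonymization, the stable mapping would be dropped entirely so values cannot be re-linked.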
 

4. Stage 4: System Application and Management, Such as Monitoring Infringement of Data Subjects’ Rights

The Guideline requires measures such as assessing and documenting the privacy risks of the AI system after development but before distribution, and drafting and publishing an Acceptable Use Policy clarifying the intended purpose and relevant prohibitions.
 
Regarding data subjects’ rights, the Guideline states that if the exercise of data subjects’ rights prescribed in the PIPA (e.g., the right to request access, correction, deletion, and suspension of processing) cannot be guaranteed due to the AI service’s training dataset size or structural issues, the data controller may inform the data subjects of these limitations and their reasons, and provide alternative means.

If a data controller makes a decision about data subjects using a generative AI system, it must determine whether such decision constitutes an automated decision under the PIPA, and ensure the data subjects’ rights, including the right to refuse, request an explanation, and request a review.

Lastly, a data controller is required to support the exercise of data subjects’ rights by disclosing the process of the AI system’s personal information processing—such as dataset collection, key sources, and processing purposes—in its privacy policy, technical documentation, or FAQs.
 

Establishing AI Privacy Governance
 

The Guideline emphasizes the need for companies and organizations to establish and operate an internal management system led by the CPO, who oversees compliance with privacy laws and risk management, as management of risks associated with data processing by generative AI becomes increasingly important.

By establishing such governance, the CPO will be able to oversee and manage the entire process—from defining the objectives of generative AI to its implementation and management—ensuring the legality and security of personal information processing. The key processes presented in the Guideline, grounded in AI privacy governance, are as follows.
 

  • Continuous privacy risk assessment using assessment tools, such as privacy impact assessment

  • Multi-layered safeguards to mitigate privacy risks

  • Systematic documentation of privacy risk management policies

  • Monitoring, assessment, and reporting of vulnerabilities in personal information

  • Support for data subjects’ exercise of their rights
     

Implications of the Guideline and Prospects

The Guideline is significant in that the PIPC systematically compiled and presented key safeguards for generative AI service providers at each stage of generative AI development and use, based on the PIPC’s other guidelines and accumulated enforcement cases and policies related to AI services.

While primarily focused on language model-based generative AI, the Guideline is expected to gradually expand to multimodal and agentic AI, which process various types of information such as voice, images, and video, making close monitoring of upcoming guidelines important.

The Guideline places special emphasis on establishing AI privacy governance to address legal risks associated with generative AI. Accordingly, companies using publicly available personal information or previously collected personal information for service development are advised to map and assess risks that may arise at each stage of development and use and to establish and implement risk management policies and measures under the leadership of the CPO.

 


[1]   For more information regarding the processing of publicly available data, please refer to the Guideline on Processing Publicly Available Data for AI Development and Services, released by the PIPC on July 17, 2024, and our newsletter (Link).

 

[Korean Version]
