While the use of cloud services is becoming more common, incidents caused by misconfigurations and inadequate permission management continue to occur. Unlike on-premises systems, cloud services are based on a "shared responsibility model" in which roles are divided between the provider and the user, often making it difficult to determine to what extent basic security measures should be implemented in-house.
This article systematically explains, risk by risk, the minimum security measures you need to take in a cloud environment. In addition to basic measures such as access management, network control, encryption, and log management, it covers practical tips common to the major clouds (AWS, Azure, GCP) and examples of common design and operational mistakes, so the content is immediately useful in the field.
While cloud usage is becoming more common, many incidents are still caused by user-side deficiencies such as misconfigurations, excessive access rights, and overly broad network access ranges. The infrastructure provided by cloud providers is very robust, but a distinctive feature of the cloud is that many areas must be configured and operated properly by the user, and these are where risk tends to arise. Particular caution is required during initial implementation and when expanding the environment, as necessary measures are easy to overlook.
Major cloud services such as AWS, Google Cloud, and Microsoft Azure operate under a "shared responsibility model," where the cloud provider is responsible for protecting the physical infrastructure and virtualization layer, while the user must properly manage account management, network settings, access permissions, data protection, and other aspects.
Common configuration mistakes
Storage and virtual servers that are supposed to be private are exposed to the Internet
Unintentionally leaving a publicly accessible database
Applying loose settings from the test environment to the production environment
Assuming security patching is the cloud provider's job, leaving vulnerabilities unaddressed
In a cloud environment, all operations are linked to accounts (IDs) and permissions. Inadequate ID management and permission design create risks of external attacks, internal fraud, and operational errors.
Common issues
Broadly granting administrative privileges to personal accounts
MFA (Multi-Factor Authentication) is not enabled
Accounts and privileges of employees who have transferred or left the company remain active
Access keys have not been rotated for a long period
The purposes of service accounts and roles are not documented, making them difficult to inventory
In the cloud, communication routes are controlled by combining virtual networks, security groups, firewall rules, etc. If these are poorly designed or configured, areas may be unintentionally exposed to the outside world, making them more susceptible to attackers.
Typical misconfiguration examples
Allowing "0.0.0.0/0" broadly in security groups
Exposing management ports such as SSH and RDP directly to the Internet
Internal resources and external public resources are mixed on the same network
Communication rules whose purpose is unclear remain in place for a long period of time
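The first two misconfigurations in the list above can be caught mechanically. Below is a minimal sketch that flags security-group-style rules exposing management ports to the whole Internet. The rule format (dicts with `port` and `cidr`) is a simplification for illustration; real rules returned by a cloud provider's API carry more fields.

```python
# Sketch: flag risky security-group rules, assuming a simplified
# rule format. Real rules from a cloud API are more complex.
MANAGEMENT_PORTS = {22, 3389}  # SSH, RDP

def risky_rules(rules):
    """Return rules that expose management ports to the whole Internet."""
    findings = []
    for rule in rules:
        open_to_world = rule["cidr"] == "0.0.0.0/0"
        if open_to_world and rule["port"] in MANAGEMENT_PORTS:
            findings.append(rule)
    return findings

rules = [
    {"port": 443, "cidr": "0.0.0.0/0"},    # HTTPS to the world: expected
    {"port": 22, "cidr": "0.0.0.0/0"},     # SSH to the world: risky
    {"port": 3389, "cidr": "10.0.0.0/8"},  # RDP from internal range: OK
]
print(risky_rules(rules))  # [{'port': 22, 'cidr': '0.0.0.0/0'}]
```

A production check would also consider IPv6 (`::/0`), port ranges, and rule direction, but the core test is the same: world-open CIDR plus management port equals a finding.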
Cloud storage and databases offer high availability and reliability, but if encryption and backup operations are not properly designed, there remains a risk of information leaks and data loss.
Common issues
Encryption at rest is left disabled in the live environment
Communication path encryption (TLS) is not thoroughly implemented
Backups are concentrated in a single region or account, reducing failure tolerance
Backups have been taken, but the restore procedure has not been verified.
The cloud contains a variety of logs, including API calls, management console operations, network flows, etc. If these logs are not collected, stored, and analyzed properly, suspicious behavior cannot be detected quickly, delaying incident response.
Representative issues
Logging of administrative actions is not enabled, or its retention period is too short
Although logs are being accumulated, they are not being utilized due to the lack of a system for analyzing and visualizing them.
Alert accuracy is low, and there are many false positives, leading to operations staff ignoring them.
Log formats are not standardized in multi-cloud environments, making cross-sectional analysis difficult
The security of a cloud environment depends on the user's configuration and operation. The basic measures that must be consistently observed from the initial construction to the operation phase are the same for all clouds. We will summarize the essential measures that cloud users should keep in mind at a minimum.
The most important measure in the cloud is thorough identity and access management. Since all operations are managed by accounts (IDs) and roles, weak permission management will directly lead to vulnerabilities in the entire system.
Minimum points to be implemented
Ensuring the principle of least privilege
Give users and applications only the minimum privileges they need and eliminate overly broad roles.
Enforced MFA (Multi-Factor Authentication)
Apply MFA not only to administrator accounts but also to general users and service accounts to reduce the risk posed by password leaks.
Password policy enforcement
By setting strength requirements (length and complexity) and expiration dates, you can build an account infrastructure that can withstand guessing and brute force attacks.
Access key rotation and usage restriction
Access keys reused for long periods are easy targets for attackers, so rotate them regularly or retire them.
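The rotation point above lends itself to automation. This sketch flags access keys older than a 90-day threshold; the key records and the threshold are illustrative, and in practice the creation dates would come from your identity provider's API (e.g., `aws iam list-access-keys`).

```python
# Sketch: flag access keys past a rotation threshold. Key records
# here are hypothetical; feed in real creation dates from your cloud.
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)  # example policy, tune to your rules

def keys_needing_rotation(keys, now=None):
    """Return the IDs of keys older than MAX_KEY_AGE."""
    now = now or datetime.now(timezone.utc)
    return [k["id"] for k in keys if now - k["created"] > MAX_KEY_AGE]

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
keys = [
    {"id": "AKIA_OLD", "created": datetime(2023, 1, 1, tzinfo=timezone.utc)},
    {"id": "AKIA_NEW", "created": datetime(2024, 5, 1, tzinfo=timezone.utc)},
]
print(keys_needing_rotation(keys, now))  # ['AKIA_OLD']
```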
Cloud network design is based on minimizing the attack surface, avoiding unnecessary exposure and broad permission scope, and clearly controlling communication paths.
Key Points
Appropriate division of VPC
Separate production, development, and testing environments to prevent the mixing of workloads of different importance.
Appropriate security group restrictions
Do not use "0.0.0.0/0" permission lightly, but open only the necessary IP ranges and ports.
Firewall perimeter defense
Use the firewall features provided by the cloud to strictly manage communication paths for externally exposed resources.
Implementing a Web Application Firewall (WAF)
Enable WAF to prevent application layer attacks such as SQL injection and XSS.
In the cloud, the risk of information leakage and data loss depends on the data protection design, so encryption and backups are essential.
Enabling encryption at rest
Encrypt all data at rest, including Amazon S3 buckets, Amazon EBS volumes, and Amazon RDS databases.
Encryption Key Management (KMS)
Leverage AWS Key Management Service for key rotation, access control, and auditing.
Thorough in-transit encryption
Enforce TLS and encrypt all communication, including between applications.
Backup multiplexing
Maintain backups in different accounts and regions to protect against failures, accidental deletion, and ransomware.
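The four data-protection measures above can be expressed as a checklist and verified against an inventory. This sketch assumes each resource is summarized as a dict of boolean flags (a hypothetical shape, not a real API response); the point is that gaps become enumerable rather than anecdotal.

```python
# Sketch: verify data-protection settings on a simplified inventory
# of storage resources. The flag names are illustrative.
REQUIRED = ("encrypted_at_rest", "tls_only", "cross_region_backup")

def protection_gaps(resources):
    """Map each resource name to the protections it is missing."""
    gaps = {}
    for res in resources:
        missing = [flag for flag in REQUIRED if not res.get(flag)]
        if missing:
            gaps[res["name"]] = missing
    return gaps

resources = [
    {"name": "app-bucket", "encrypted_at_rest": True, "tls_only": True,
     "cross_region_backup": False},
    {"name": "audit-db", "encrypted_at_rest": True, "tls_only": True,
     "cross_region_backup": True},
]
print(protection_gaps(resources))  # {'app-bucket': ['cross_region_backup']}
```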
Insufficient monitoring can lead to attacks and internal fraud going undetected, resulting in greater damage. The cloud combines logs and threat detection services to enable early detection.
Log collection and centralized management
Comprehensively collect logs such as API calls, operation history, and network flows.
Designing alert rules
Set rules to receive immediate notifications for critical events (such as permission changes or suspicious access).
Utilizing threat detection services
With AWS, you can use Amazon GuardDuty and AWS Security Hub to continuously detect threats.
Continuous visibility and analytics
Display aggregated logs on a dashboard and review operational status periodically.
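The alert-rule idea above reduces to matching incoming log records against a set of critical operations. In this sketch the event names mirror real AWS API calls (e.g., `StopLogging`), but the record format is simplified; a real pipeline would consume full CloudTrail records.

```python
# Sketch: match CloudTrail-style event records against a rule set
# for critical operations. Record shape is simplified.
CRITICAL_EVENTS = {
    "PutUserPolicy",     # permission change
    "AttachRolePolicy",  # permission change
    "PutBucketAcl",      # disclosure-scope change
    "StopLogging",       # tampering with the audit trail
}

def alerts_for(events):
    """Return the events that should trigger an immediate notification."""
    return [e for e in events if e["eventName"] in CRITICAL_EVENTS]

events = [
    {"eventName": "GetObject", "user": "app-role"},
    {"eventName": "StopLogging", "user": "admin"},
]
print(alerts_for(events))  # [{'eventName': 'StopLogging', 'user': 'admin'}]
```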
Because configuration changes occur frequently in cloud environments, a mechanism for continuously verifying status is essential.
Introducing Cloud Security Posture Management (CSPM)
Use AWS Security Hub and similar tools to automatically detect misconfigurations and deviations from best practices.
Continuous patch application
Keep patching the areas that are the user's responsibility under the shared responsibility model, such as Amazon EC2 instances, containers, serverless runtimes, and middleware.
Configuration Drift Detection
Detect manual changes or unexpected configuration differences and restore the intended configuration.
Regular vulnerability scans
Scan the OS, libraries, and container images for vulnerabilities and prioritize remediation of critical issues.
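Configuration drift detection, mentioned above, is at its core a diff between intended and actual state. A minimal sketch: the baseline would normally come from IaC (CloudFormation, Terraform) and the live state from the cloud API; both are shown here as plain dicts for illustration.

```python
# Sketch: detect configuration drift by diffing deployed state
# against the intended baseline (both shown as plain dicts).
def detect_drift(baseline, live):
    """Return keys whose live value differs from the baseline."""
    drift = {}
    for key in baseline.keys() | live.keys():
        if baseline.get(key) != live.get(key):
            drift[key] = {"expected": baseline.get(key), "actual": live.get(key)}
    return drift

baseline = {"public_access": False, "versioning": True, "encryption": "aws:kms"}
live     = {"public_access": True,  "versioning": True, "encryption": "aws:kms"}
print(detect_drift(baseline, live))
# {'public_access': {'expected': False, 'actual': True}}
```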
Although AWS, Google Cloud, and Microsoft Azure have different architectures and function names, they share the same principles for enhancing security. The five areas of ID management, network, data protection, logging, and asset management are highly effective in all clouds and are important regardless of the scale of deployment.
In the cloud, IAM (Identity and Access Management) is a critical area. Incorrect authorization or insufficient inventory can lead directly to serious incidents.
Key points to practice
Adhere to least privilege design
Avoid unnecessary administrative privileges and broad roles, and assign limited privileges according to the purpose of use.
Regularly take inventory of accounts and roles
Promptly delete accounts of employees who have left or transferred, as well as unused service accounts.
Clarify the purpose of service accounts
Organize by purpose, such as people, systems, or batches, and do not leave any unnecessary permissions or access keys.
Clarifying rules for using privileged accounts
With AWS, you can use AWS IAM Identity Center and role switching to operate without using administrative privileges on a daily basis.
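Least-privilege reviews like those above often start with a mechanical pass over policy documents. This minimal lint flags Allow statements that use wildcard actions; the policy JSON follows the real IAM grammar, but the check covers only this one condition and is not a full policy analyzer.

```python
# Sketch: lint an IAM policy document for over-broad Allow statements.
# Only wildcard actions are checked; wildcard resources, NotAction,
# and conditions would need additional rules.
def overly_broad(policy):
    """Return the action lists of Allow statements using wildcards."""
    findings = []
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        actions = [actions] if isinstance(actions, str) else actions
        if any(a == "*" or a.endswith(":*") for a in actions):
            findings.append(actions)
    return findings

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::app-bucket/*"},
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    ],
}
print(overly_broad(policy))  # [['s3:*']]
```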
In the cloud, encrypting storage and databases and checking disclosure settings are fundamental to preventing information leaks.
Encrypt all data at rest
Enable encryption for Amazon S3, Amazon RDS, Amazon EBS, etc.
Regularly check your privacy settings (Public/Private)
Use CSPM tools and automated checks to ensure there are no unintended public access or sharing settings remaining.
Layered access control
Combine IAM policies, VPC endpoints, and network settings to make access routes explicit.
Establishment of key management (KMS, etc.)
Ensure key rotation, access control, and audit logging.
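The "regularly check disclosure settings" point above is exactly what CSPM tools automate, and the core check is simple. This sketch inspects bucket-style settings for unintended public access, assuming a simplified settings dict (real S3 exposure depends on ACLs, bucket policies, and the Block Public Access feature together).

```python
# Sketch: check bucket-style disclosure settings for unintended
# public access. Settings dict is a simplification of real APIs.
def publicly_exposed(buckets):
    """Return names of buckets reachable without authentication."""
    exposed = []
    for b in buckets:
        public_acl = b.get("acl") == "public-read"
        unguarded = not b.get("block_public_access", False)
        if public_acl or unguarded:
            exposed.append(b["name"])
    return exposed

buckets = [
    {"name": "static-site", "acl": "public-read", "block_public_access": False},
    {"name": "backups", "acl": "private", "block_public_access": True},
]
print(publicly_exposed(buckets))  # ['static-site']
```

Note the check is deliberately pessimistic: a bucket with no explicit block-public-access flag counts as exposed, which matches the "private by default" stance discussed later in this article.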
It is effective to design cloud networks with a zero trust philosophy (not assuming trust).
Minimize Internet Exposure
Only expose the bare minimum that is truly necessary, such as web applications.
Prefer private connections
Use AWS PrivateLink, VPN, or dedicated lines to select a configuration that does not route internal communications via the Internet.
Isolation of critical resources
Isolate management servers, databases, batch servers, externally-facing applications, etc. at the network level.
Adoption of Bastion and Zero Trust Access
Strictly enforce an operating model that never exposes management SSH/RDP access directly to the Internet.
In the cloud, there are many types of logs, and the formats vary depending on the service. If left as is, this increases the analysis load and can lead to delays in incident detection.
Collect and standardize logs centrally
On AWS, aggregate logs into Amazon CloudWatch and AWS CloudTrail; on Azure, into Azure Monitor; and so on.
Integrate monitoring and alerting
Establish a system that sends immediate notifications for critical events (such as permission changes, suspicious access, or changes to the scope of disclosure).
Use threat detection services together
For AWS, use Amazon GuardDuty, for Azure, use Microsoft Defender for Cloud, and for GCP, use Security Command Center.
Prepare a visualization dashboard
Creating a screen that can be checked on a daily basis prevents operations from becoming dependent on one individual.
Cloud resources increase at a rapid pace, and if not managed properly, it becomes difficult to know the location and use of assets.
Unification of naming conventions
Set rules including project name, environment (prod/dev/test), role, etc.
Thorough tagging
Tags clearly indicate the owner, purpose, cost center, confidentiality classification, etc., making searching and inventory easier.
Tag-based auditing and visibility
Use tag information for cost analysis, security audits, automated remediation flows, and more.
Applying common rules to multi-cloud environments
By aligning tags and naming conventions across AWS/GCP/Azure, cross-site management becomes easier.
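The tagging rules above only help if they are enforced, and enforcement is easy to automate. This sketch validates a resource against a required tag set; the tag keys and allowed environment values are examples — substitute whatever your organization mandates.

```python
# Sketch: enforce a shared tagging rule across resources.
# Required keys and allowed env values are example policy, not a standard.
REQUIRED_TAGS = {"project", "env", "owner", "cost-center"}
VALID_ENVS = {"prod", "dev", "test"}

def tag_violations(resource):
    """Return a list of human-readable tag-policy violations."""
    tags = resource.get("tags", {})
    issues = [f"missing tag: {t}" for t in sorted(REQUIRED_TAGS - tags.keys())]
    if "env" in tags and tags["env"] not in VALID_ENVS:
        issues.append(f"invalid env: {tags['env']}")
    return issues

res = {"id": "i-0abc", "tags": {"project": "shop", "env": "prd", "owner": "team-a"}}
print(tag_violations(res))  # ['missing tag: cost-center', 'invalid env: prd']
```

Run as a periodic job or a deploy-time gate, a check like this keeps inventories searchable and makes the cost and audit use cases above possible.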
Major cloud services come standard with dedicated services that complement security measures and automatically detect misconfigurations and threats. By combining these, you can cover areas that are difficult to cover through manual operations and improve your overall security level.
The core of a cloud environment is a service that manages IDs and access rights.
AWS IAM (AWS Identity and Access Management)
Manage users, groups, roles, and policies, and set fine-grained permissions. Use it in conjunction with AWS IAM Identity Center to support SSO and integrated authentication.
Microsoft Entra ID (formerly Azure Active Directory)
It is strong in authentication, authorization, SSO, and enterprise application integration, and can build a centralized ID infrastructure by integrating with Microsoft 365.
Google Cloud IAM (Identity and Access Management)
It features detailed role-based permission management and can also be integrated with Google Workspace.
When to use it
Authentication and authorization integration
Implementing least privilege
Enforcement of SSO and MFA
Service Account Management
For network layer defense, we use the WAF, firewall, and dedicated connection services provided by each cloud.
AWS WAF (Web Application Firewall)
Prevents attacks at the web application layer, such as SQL injection and XSS.
AWS Firewall Manager
You can centrally manage settings for security groups, AWS WAF, AWS Shield, and more, and apply policies across your organization.
AWS PrivateLink
Services and VPCs can be connected via a private network, allowing communication without going through the Internet.
When to use it
Protecting Externally-Facing Applications
Closing communications that do not need to go via the Internet
Centralized management of network security
This is a service that detects suspicious behavior and potential threats occurring within the cloud.
Amazon GuardDuty
Analyzes AWS CloudTrail event logs, VPC flow logs, and DNS logs to detect signs of malware infection or unauthorized access.
AWS Security Hub
The detection results of each service are integrated to provide a comprehensive view of compliance with best practices.
Security Command Center (Google Cloud)
Centralize threat detection and configuration risk management across GCP.
When to use it
Threat detection in the cloud
Early detection of suspicious operations
Security Alert Integration
Log collection and auditing are essential foundations for incident response.
Amazon CloudWatch
An integrated monitoring service that monitors metrics, collects logs, and sets alarms.
AWS CloudTrail
You can record API calls and admin console operations to track change and access histories.
Azure Monitor / Log Analytics
You can centrally manage metrics and logs for Azure resources.
Google Cloud Logging/Cloud Monitoring
It can be used as a log monitoring platform for the entire GCP.
When to use it
Acquisition and storage of audit logs
Tracking API operation history
System status visualization and alert settings
Because configuration changes occur frequently in cloud environments, a service that automatically detects misconfigurations and configuration drift is essential.
AWS Security Hub
It checks for compliance with best practices such as CIS benchmarks and automatically detects abnormal configurations.
Microsoft Defender for Cloud
It provides comprehensive configuration assessment, vulnerability diagnosis, and threat detection for Azure resources.
Security Command Center (GCP)
Centrally manage configuration risks and threats in GCP.
When to use it
Use as CSPM (Cloud Security Posture Management)
Early detection of configuration errors
Standardized configuration and automated checks
Cloud security is determined more at the design stage than at the build stage. Proceeding with construction on an unclear design can lead to configurations that are difficult to correct later, or to unintended exposure and privilege expansion.
In the cloud, the basic design is based on the assumptions of "least privilege" and "zero trust."
Enforcing Least Privilege
Grant only the minimum permissions required to all entities, including applications, users, batch processes, service accounts, etc. Avoid broad permissions in IAM policies and instead create dedicated roles.
Implementing Zero Trust
This is the idea of verifying all access. Even if the network is internal, it is important to think that "authentication and authorization are always required" and not to create unnecessary communication paths.
Strengthen identity-based protection rather than network perimeter protection
Instead of relying on traditional perimeter defense, protect at multiple layers, including IAM policies, security groups, and service-to-service authorization.
To properly design cloud data protection, classify the protection level according to the data's importance and purpose of use.
Data classification implementation
For example, you can classify data as "confidential," "internal," "public," or "archived," and set an appropriate protection level for each type of data.
Applying controls according to protection level
Sensitive data: Strict encryption key management, access via PrivateLink, and mandatory access logs
Internal data: Access classification by IAM, network isolation
Public data: Preventing incorrect public settings, applying WAF
Archive: Durability-focused storage lifecycle management
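A classification scheme like the one above is most useful when the mapping from class to controls is explicit and machine-readable. This sketch encodes the four classes listed as a lookup table; the specific controls are illustrative policy choices taken from the list above, not a standard.

```python
# Sketch: look up required controls from a data classification.
# The mapping mirrors the list above and is illustrative policy.
CONTROLS = {
    "confidential": ["customer-managed KMS key", "PrivateLink only",
                     "access logging"],
    "internal":     ["IAM-scoped access", "network isolation"],
    "public":       ["public-misconfiguration checks", "WAF"],
    "archived":     ["durability-focused lifecycle policy"],
}

def required_controls(classification):
    """Return the controls mandated for a data class (case-insensitive)."""
    return CONTROLS[classification.lower()]

print(required_controls("Confidential"))
```

Encoding the mapping this way lets provisioning scripts and audits reference one source of truth instead of a document that drifts.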
Identifying access patterns
Define who will access the data, from where, and how often, and apply least privilege accordingly.
In cloud environments, storage, databases, APIs, etc. are easily exposed to the public by mistake, which often leads to incidents. It is important to clarify "disclosure standards" at the design stage.
Clarifying standards for external disclosure
Clarify the criteria for your decision: "Do general users need access?" and "Is it sufficient for administrators and systems only?"
Private by default, public only as an exception
A rule that resources are Private as the general case and Public only by exception prevents misconfiguration.
Fixing communication routes
With AWS, you can limit accessible routes by using AWS PrivateLink, VPC endpoints, security groups, etc.
Automatic publishing setting check
Continuously detect exposure misconfigurations with CSPM (e.g., AWS Security Hub, Microsoft Defender for Cloud).
Even in a cloud environment, there is a risk of data loss due to failures, accidental deletion, ransomware, etc., so backup and DR (Disaster Recovery) must be incorporated from the design stage.
Defining RTO/RPO
RTO (Recovery Time Objective): How quickly do you need to recover?
RPO (Recovery Point Objective): How much data can you afford to lose?
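The two objectives above translate into a simple arithmetic check: worst-case data loss equals the interval between backups, so the backup interval must not exceed the RPO, and a tested restore must complete within the RTO. The hour values below are examples.

```python
# Sketch: sanity-check a backup schedule against RPO/RTO targets.
# Worst-case data loss == backup interval, so interval must be <= RPO.
def meets_objectives(backup_interval_h, measured_restore_h, rpo_h, rto_h):
    """Return whether the schedule and tested restore meet the targets."""
    return {
        "rpo_met": backup_interval_h <= rpo_h,
        "rto_met": measured_restore_h <= rto_h,
    }

# Daily backups and a 6-hour tested restore, against RPO=4h / RTO=8h
print(meets_objectives(backup_interval_h=24, measured_restore_h=6,
                       rpo_h=4, rto_h=8))
# {'rpo_met': False, 'rto_met': True} -> backups must run more often
```

Note that `measured_restore_h` must come from an actual DR test, not an estimate, which is exactly why the restore verification step below matters.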
Backup multiplexing
Keep backups in different accounts and regions to prevent data loss due to a single failure.
Selection of DR method
Warm Standby
Pilot light
Cold standby
Multi-region configuration
Verification of recovery procedures (DR testing)
Even if you have a backup, there is no guarantee that you will actually be able to recover, so be sure to conduct regular DR tests.
The security level of a cloud environment changes during operation after construction. A characteristic of the cloud is that resources are frequently added and changed, so even if the environment is secure at the time of design, it is prone to configuration degradation over time. During the operation phase, critical areas are continuously checked and improved.
IAM is the area most susceptible to degradation in cloud operations. The accumulation of permission additions, exception handling, and temporary configuration changes can lead to an increase in accounts with overly broad permissions and unknown uses.
Points to check regularly
Delete unnecessary accounts and roles
Securely delete accounts of employees who have left or been transferred, as well as unused roles.
Checking access key usage
Disable keys that have not been used for a long time or that show signs of suspicious use.
Check privileged account usage history
Regularly check the number of times the administrator role is used and who is using it to prevent abuse.
Reviewing policies to ensure least privilege
Trim permissions that have accumulated through past operations down to only those that are truly necessary.
In a cloud-native environment, containers and serverless functions are updated frequently, making it easy for vulnerabilities to remain in dependent libraries and container images.
Important points to be implemented during operation
Container image vulnerability scanning
Detect images with critical vulnerabilities using features such as Amazon Elastic Container Registry (Amazon ECR) image scanning.
Check for updates to dependent packages
Regularly check for updates to the libraries used in serverless.
CI/CD and scanning integration
Create a flow that automatically scans at build time and stops the deployment if vulnerabilities are found.
Runtime vulnerability monitoring
Continuously checks the execution environment for vulnerabilities (OS, runtime, framework).
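The CI/CD gate described above comes down to one decision: given scan findings, is the deployment allowed? This sketch imitates the severity field of image-scan results (as ECR scanning reports it), with a simplified finding shape; the blocking severity set is a policy choice.

```python
# Sketch: gate a deployment on image-scan findings. Finding shape is
# simplified; BLOCKING is an example policy.
BLOCKING = {"CRITICAL", "HIGH"}

def deploy_allowed(findings):
    """Return (allowed, blocking_findings) for a set of scan results."""
    blockers = [f for f in findings if f["severity"] in BLOCKING]
    return len(blockers) == 0, blockers

findings = [
    {"cve": "CVE-2024-0001", "severity": "LOW"},
    {"cve": "CVE-2024-0002", "severity": "CRITICAL"},
]
allowed, blockers = deploy_allowed(findings)
print(allowed, [b["cve"] for b in blockers])
# False ['CVE-2024-0002']
```

In a pipeline, a `False` result would exit non-zero so the build step fails and the deploy never runs.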
Simply collecting logs is not enough to detect suspicious or abnormal behavior. In the operational phase, the emphasis must be on "making it visible" and "accurately notifying."
What to do specifically
Development of a log visualization dashboard
Use Amazon CloudWatch and Azure Monitor to list key metrics and events.
Reducing alert noise
For alerts with many false positives, optimize conditions and adjust thresholds so important notifications are not buried.
Prioritizing Critical Events
Prioritize events that require immediate attention, such as permission changes, changes to visibility settings, and API key abuse.
Continued monitoring of threat detection services
Regularly review and improve Amazon GuardDuty and Microsoft Defender for Cloud detections.
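One concrete way to reduce the alert noise discussed above is suppressing repeats: if the same rule fires for the same resource within a short window, only the first occurrence needs to page anyone. The 30-minute window and alert shape below are illustrative.

```python
# Sketch: suppress repeated alerts within a time window so duplicate
# noise does not bury critical notifications. Window is an example.
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)

def dedupe(alerts):
    """Keep only the first alert per (rule, resource) inside WINDOW."""
    last_seen, kept = {}, []
    for a in sorted(alerts, key=lambda a: a["time"]):
        key = (a["rule"], a["resource"])
        if key not in last_seen or a["time"] - last_seen[key] > WINDOW:
            kept.append(a)
        last_seen[key] = a["time"]  # refresh even when suppressed
    return kept

t0 = datetime(2024, 6, 1, 9, 0)
alerts = [
    {"rule": "root-login", "resource": "acct-1", "time": t0},
    {"rule": "root-login", "resource": "acct-1",
     "time": t0 + timedelta(minutes=5)},   # suppressed duplicate
    {"rule": "root-login", "resource": "acct-1",
     "time": t0 + timedelta(minutes=50)},  # far enough apart: kept
]
print(len(dedupe(alerts)))  # 2
```

Refreshing `last_seen` even for suppressed alerts means a continuous storm stays quiet until it pauses; whether that is desirable is itself a tuning decision.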
In cloud environments, manual changes and temporary solutions can easily cause "configuration drift," which can lead to serious configuration errors. If configuration drift is left unchecked, the state of things deviating from the original design intent will accumulate, increasing the risk of incidents.
Necessary checks during operation
Continuous checks with CSPM
Use AWS Security Hub or Microsoft Defender for Cloud to automatically detect misconfigurations and deviations.
Detecting differences with IaC
If you use infrastructure as code (e.g., CloudFormation, Terraform), regularly check the diff against your code.
Managing Manual Changes
Any manual changes made by an administrator to the settings should be recorded and verified to ensure they were intended.
Rapid correction of drift when it occurs
Any discrepancies discovered will be corrected as soon as possible, and operational rules for returning to the standard configuration will be clarified.
Cloud computing has many configuration items and a high degree of freedom, so there are many cases where improper design and operation directly lead to incidents. "Omissions" in the initial construction and inadequate operational rules are areas that can easily lead to serious accidents.
Cloud defaults provide a certain baseline of security, but moving to production without any changes poses serious risks. Default settings are merely a state for verifying minimal operation and fall short of the security level required for production use.
Common mistakes
Amazon EC2 and Amazon RDS security groups are still broad
Amazon S3 bucket encryption is left disabled in the live environment
Logging (AWS CloudTrail or Amazon CloudWatch Logs) is left unconfigured
MFA is not configured in AWS IAM, and administrator privileges are used without it
The most common type of cloud incident is the accidental disclosure of storage. A single mistake in disclosing storage can lead to an immediate information leak, so strict checks are required.
Typical examples
Accidentally allowing public access to your Amazon S3 bucket
Static files that do not need to be made public are accessible externally
Application logs and backup data are publicly available
Access control lists (ACLs) and bucket policies become complex and misconfigured.
When IAM operations depend on specific individuals, visibility into the entire cloud is lost and no one knows who can do what. Permission management that depends on individuals not only delays identifying the cause of an incident, but also increases the risk of internal fraud and operational errors.
Common situations
There are still a lot of roles and policies that haven't been reviewed in years.
There is an account that has been "temporarily" granted administrator privileges
Accounts of employees who have left or been transferred remain untouched
The purpose of the service account has become unclear and it cannot be deleted
Even if logs are output, auditing and threat detection will not function if they are not being monitored. In an environment without monitoring or notification, attacks will not be noticed and post-incident response will not be possible.
Common mistakes
AWS CloudTrail logs are not being stored or are stored for too short a period
Amazon CloudWatch alarms are not fully set up
There are no alerts regarding suspicious operations or configuration changes
You do not have Amazon GuardDuty or Microsoft Defender for Cloud enabled
Cloud security is not something that "stabilizes once it's implemented"; continuous operation is essential. Cloud environments must be designed with the assumption that they will continue to change. If measures are implemented once and then left alone, vulnerabilities will accumulate over time.
Common mistakes
The original design documents were not updated, and the gap between them and reality became larger.
CSPM (e.g. AWS Security Hub) has been implemented, but the results have not been reflected in operational improvements.
No continuous checks are performed on permissions, network, storage, logs, etc.
Manual changes lead to configuration drift, leading to significant deviations from best practices
Cloud security is not a one-off measure, but a process of continuous maturation. Regardless of the size of your environment or operational structure, you can gradually improve safety and operational efficiency by following the four steps below.
The first step is to "correctly understand the current situation." Without visibility into asset status and configuration details, it is impossible to consider appropriate measures. If the visualization phase is neglected, subsequent standardization and automation will not progress, and operations will become dependent on individual skills.
Points to be particularly addressed
Asset Visibility
List which cloud resources exist in which accounts and regions.
If there are any missing tags, this is the time to start working on them.
Scanning the configuration
Use CSPM tools such as AWS Security Hub, Microsoft Defender for Cloud, and Security Command Center to identify misconfigurations.
Enabling and aggregating logs
Be sure to enable infrastructure logs such as AWS CloudTrail, Amazon CloudWatch Logs, and Azure Monitor, and begin retaining them long term.
Once the current situation is visualized, the next step is "standardization" of operational rules. Standardization eliminates the situation where results depend on who did the work, significantly reducing operational burden and configuration errors. As standardization progresses, the management level becomes uniform across the team, stabilizing security quality.
Priority areas
Standardizing IAM permissions
Define naming conventions for roles and policies and standardize least privilege templates.
Network configuration standardization
Classify VPCs and subnets by purpose, such as public, internal, and management segments, and turn the configurations into templates.
Unification of tag and naming rules
Require tags such as project name, environment, responsible person, and cost center to simplify asset management.
It is difficult to maintain standardized operations through manual effort alone. Automate as much as possible to reduce human error while ensuring that security measures are applied continuously. As automation progresses, measures are less likely to be missed, and a high level of operational quality can be maintained even by a small team.
Automation to be implemented
Automated Threat Detection
Amazon GuardDuty and Microsoft Defender for Cloud are always enabled to automatically detect suspicious behavior.
Automatic detection and repair of misconfigurations
Apply automated checks and remediation rules for IAM, network, storage exposure settings, and more.
CI/CD integration
Incorporate vulnerability scanning (such as Amazon ECR scanning) into your deployment pipeline and stop the deployment if there are any issues.
Log aggregation and automated alerts
Alerts are automatically sent to the operations team, accelerating response times when an abnormality occurs.
The final step is "continuous improvement." Cloud environments are constantly changing, so implementing measures once and then abandoning them will result in deterioration. It is important to incorporate an improvement cycle and keep the environment up to date. By achieving continuous improvement, you can ensure that your cloud environment remains safe and operationally efficient over the long term.
What needs to be addressed
Regular reviews
We review IAM permissions, network configuration, public settings, tag operations, etc. on a quarterly basis.
Creation of operational reports
Visualize the number of alerts, detected threats, and the status of correcting misconfigurations, and use these as indicators for improvement.
Regular audits
We use internal and third-party audits to verify that standardization and automation are working properly.
Keeping up with new services
We continually incorporate new security features and best practices to advance the maturity of your environment.
In cloud security, success depends more on a continuous operational model than on one-off measures. By implementing least privilege and zero trust as the foundations, along with managing the scope of disclosure, encryption, log maintenance, and configuration management, the risk of misconfigurations and attacks can be significantly reduced. By proceeding step-by-step from visualization to standardization, automation, and improvement, the level of security can be raised regardless of the size of the environment.