Weekly Cloud Info #W47 - 2024

Hi!

This week's cloud highlights: Kueue's MultiKueue feature brings cross-cluster job scheduling to Kubernetes for AI and HPC workloads. Amazon S3 Express One Zone adds the ability to append data to existing objects. CNCF publishes guidance on open-source maturity, with reference architectures for multicluster and AI/ML tools. Mistral AI’s Le Chat gets upgrades, including web search and task automation, while AWS enhances VPC security with Block Public Access.

Have a great read.

📰 Top picks of the week

Amazon Aurora Serverless v2 supports scaling to zero capacity

Amazon Aurora Serverless v2 now supports scaling down to 0 Aurora Capacity Units (ACUs), allowing databases to pause during inactivity and resume on demand. This saves costs by eliminating charges for idle compute capacity. The feature is available for supported versions of Aurora PostgreSQL and Aurora MySQL, and can be configured in the AWS Management Console for new or existing clusters.
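For readers who prefer the API over the console, here is a minimal sketch of what enabling scale-to-zero might look like with boto3. The parameter names (MinCapacity, MaxCapacity, SecondsUntilAutoPause) and the 300 to 86,400 second pause window reflect our understanding of the RDS API and are not taken from the announcement, so check the current documentation before relying on them. The helper names and the cluster identifier are hypothetical.

```python
def scale_to_zero_config(max_acu: float, pause_after_seconds: int = 300) -> dict:
    """Build a Serverless v2 scaling configuration with a 0 ACU minimum.

    Assumption: the auto-pause delay must be between 300 and 86400 seconds.
    """
    if not 300 <= pause_after_seconds <= 86400:
        raise ValueError("pause_after_seconds must be between 300 and 86400")
    if max_acu < 1:
        raise ValueError("max_acu must be at least 1")
    return {
        "MinCapacity": 0,  # 0 ACUs lets the cluster pause when idle
        "MaxCapacity": max_acu,
        "SecondsUntilAutoPause": pause_after_seconds,
    }


def enable_auto_pause(rds_client, cluster_id: str, max_acu: float,
                      pause_after_seconds: int = 300):
    """Apply the configuration to an existing cluster.

    rds_client is expected to be a boto3 RDS client, e.g. boto3.client("rds").
    """
    return rds_client.modify_db_cluster(
        DBClusterIdentifier=cluster_id,
        ServerlessV2ScalingConfiguration=scale_to_zero_config(
            max_acu, pause_after_seconds
        ),
        ApplyImmediately=True,
    )
```

Passing the client in as an argument keeps the sketch easy to try against a stub before pointing it at a real account.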

AWS Application Load Balancer introduces header modification for enhanced traffic control and security

AWS Application Load Balancer (ALB) now supports HTTP header modification, enabling centralized traffic management and security without altering application code. Features include renaming load balancer-generated TLS headers, inserting custom headers (e.g., HSTS and CORS), and disabling the "Server" response header for improved security. Available in all AWS Regions, this feature can be configured via APIs, CLI, or the Management Console. Explore ALB documentation for details.

Centrally managing root access for customers using AWS Organizations

AWS Identity and Access Management (IAM) now enables centralized management of root credentials and sessions across AWS Organizations. Security teams can eliminate long-term root credentials, enforce secure-by-default accounts, and gain short-term task-scoped root access for specific actions like unlocking S3 bucket policies. Available in all AWS Regions (excluding GovCloud and China), this feature simplifies root management, reduces security risks, and aligns with AWS best practices.

Enhancing VPC Security with Amazon VPC Block Public Access

AWS launches Amazon VPC Block Public Access, a declarative feature to block internet traffic at the VPC level. It offers two modes: a bidirectional block that stops both inbound and outbound traffic, and an ingress-only block that still permits outbound traffic through NAT Gateways and Egress-Only Internet Gateways (EIGWs). Granular exclusions let users permit internet access for specific subnets or VPCs as needed. Now available across all AWS Regions, this feature strengthens compliance and simplifies security management. Learn more in the Amazon VPC Block Public Access documentation.
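As a rough illustration of the block modes and exclusions described above, here is a short boto3-style sketch. The call names (modify_vpc_block_public_access_options, create_vpc_block_public_access_exclusion) and the mode strings are our best understanding of the EC2 API rather than details from the announcement, so treat them as assumptions to verify. The helper names and the subnet ID in the usage note are hypothetical.

```python
BLOCK_MODES = {"off", "block-bidirectional", "block-ingress"}
EXCLUSION_MODES = {"allow-bidirectional", "allow-egress"}


def set_block_mode(ec2_client, mode: str = "block-bidirectional"):
    """Turn on VPC Block Public Access for the current Region and account.

    ec2_client is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    if mode not in BLOCK_MODES:
        raise ValueError(f"unknown block mode: {mode}")
    return ec2_client.modify_vpc_block_public_access_options(
        InternetGatewayBlockMode=mode
    )


def exclude_subnet(ec2_client, subnet_id: str, mode: str = "allow-egress"):
    """Exempt a single subnet from the block, e.g. one that must reach the internet."""
    if mode not in EXCLUSION_MODES:
        raise ValueError(f"unknown exclusion mode: {mode}")
    return ec2_client.create_vpc_block_public_access_exclusion(
        SubnetId=subnet_id,
        InternetGatewayExclusionMode=mode,
    )
```

A typical sequence would be set_block_mode(client, "block-ingress") followed by exclude_subnet(client, "subnet-0123456789abcdef0", "allow-bidirectional") for a public-facing subnet.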

Amazon S3 Express One Zone now supports the ability to append data to an object

Amazon S3 Express One Zone now allows applications to append data directly to existing objects, simplifying workflows for use cases like log processing and media broadcasting. This eliminates the need for intermediate local storage, enabling real-time updates and reads within S3. Available across all supported AWS Regions, the feature can be accessed via AWS SDK, CLI, or Mountpoint for Amazon S3. Learn more in the S3 User Guide.
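To make the append workflow concrete, here is a minimal sketch using a boto3-style S3 client. It assumes that PutObject accepts a WriteOffsetBytes parameter for directory buckets and that the offset must equal the object's current size; both are our understanding of the API rather than details from the announcement. The function name is hypothetical.

```python
def append_to_object(s3_client, bucket: str, key: str, data: bytes) -> int:
    """Append data to an existing object and return its new size in bytes.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3"),
    and bucket an S3 Express One Zone directory bucket.
    """
    # The append offset is the current length of the object.
    current_size = s3_client.head_object(Bucket=bucket, Key=key)["ContentLength"]

    # Assumption: WriteOffsetBytes tells S3 where the new bytes start.
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        WriteOffsetBytes=current_size,
    )
    return current_size + len(data)
```

Note that this sketch is not safe for concurrent writers: two appenders could read the same size and race on the same offset.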

65,000 nodes and counting: Google Kubernetes Engine is ready for trillion-parameter AI models

Google Cloud introduces support for 65,000-node clusters in GKE, enabling the scale needed for next-gen AI training and inference.

This scale supports over 250,000 accelerators per cluster and reduces model training time for multi-trillion parameter AI models. Innovations include a Spanner-based key-value store for enhanced reliability and speed. Customers like Anthropic are leveraging GKE’s expanded capacity to accelerate AI advancements.

Google Cloud Translation AI now covering 189 languages

Google Cloud introduces Translation AI in Vertex AI, offering unmatched translation quality with adaptive and customizable features for businesses.

Supporting 189 languages, it bridges the gap between speed and nuance, helping brands connect authentically with global audiences. Uber and other enterprises leverage these tools for seamless multilingual experiences.

Mistral unveils new AI models and chat features

Le Chat, Mistral AI's free generative AI assistant, now includes advanced capabilities like web search with citations, document and image analysis, and a new Canvas interface for collaborative ideation.

Powered by the multimodal Pixtral Large model, it processes PDFs, generates high-quality images, and supports task automation with customizable agents. Currently in free beta, Le Chat offers a comprehensive platform for work and creativity. Learn more about its innovative tools and features.

Kueue Can Now Schedule Kubernetes Batch Jobs Across Clusters

The Kubernetes batch scheduler Kueue now supports MultiKueue, a beta feature enabling job dispatch across multiple clusters. This enhancement streamlines operations and expands computational resource options for AI and HPC workloads.

Organizations like CERN are leveraging MultiKueue to manage jobs across on-premises, cloud, and HPC environments. MultiKueue ensures jobs are executed on clusters with sufficient capacity, making workload placement seamless and efficient.

CNCF Releases Guidance on Open-Source Maturity and Reference Architectures

The Cloud Native Computing Foundation (CNCF) has published a report offering guidance on assessing open-source software maturity and reference architectures for multicluster application management and AI/ML use cases.

The report highlights technologies like Cilium (+47 maturity score for multicluster apps) and Apache Airflow (+51 for AI/ML). Initial case studies include Allianz Direct's platform engineering and Adobe's cell-based architecture. Developed by the CNCF End User Community, the insights aim to accelerate reliable production deployments of cloud-native tools.

❤️ You might also like

  • Autoflow, a conversational knowledge base tool built on Graph RAG LINK

  • The Data Engineering Handbook LINK

  • Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference LINK

  • DOJ will push Google to sell off Chrome LINK

  • Daisy, an AI granny wasting scammers' time LINK

🎁 This week's hidden gem

Join the AWS Associate Certification Challenge and get a 50% discount voucher LINK

Send “GIFT” to [email protected] to receive all of this month's hidden gems within the next 2 hours.

🏁 Enjoy this newsletter?

Forward it to a friend, and let them know they can subscribe here.