Swami Sivasubramanian

VP, AWS Agentic AI

Greater Seattle Area
135K followers 500+ connections

About

I run AI and Data services in AWS.

I have been awarded (or filed for) more than 250 patents, authored around 40 refereed scientific papers and journal articles, and participate in several academic circles and conferences. In addition, I was part of the team that built several AWS services such as CloudFront, Amazon RDS, Amazon S3, Amazon's Paxos-based lock service, and the original Amazon Dynamo. I was also one of the main authors of the Amazon Dynamo paper (http://bit.ly/1mDs0Yh) along with Werner Vogels. Amazon Dynamo is now the foundation for many other NoSQL systems such as Riak, Cassandra, and Voldemort.

Articles by Swami

  • 11 takeaways from 2024

    As we close out the year, I want to shoutout the AWS team’s awesome work over the last 12 months. They’ve delivered…

    16 Comments
  • Chief Data Officer Insights on Generative AI and Data Strategy

    I’ve often said that data is the genesis for modern invention. It only takes one groundbreaking invention—one iconic…

    18 Comments
  • Why AWS is investing in a zero-ETL future

    Data is at the center of every application, process, and business decision. When data is used to improve customer…

    18 Comments

Experience

  • Amazon Web Services (AWS)

    3 years 3 months

    • VP, AWS Agentic AI

      Amazon Web Services (AWS)

      - Present 1 month

      Seattle, Washington, United States

    • VP, AI and Data

      Amazon Web Services (AWS)

      - 3 years 3 months

      Seattle, Washington, United States

      I manage our database, analytics, and machine learning services in AWS. I am excited about enabling developers, data scientists, and businesses "to put their data to work" by bringing state-of-the-art database technologies to handle transactional data, making it easy to analyze and visualize that data, and finally deriving insights powered by machine learning.

      Databases: https://aws.amazon.com/products/databases/
      Analytics: https://aws.amazon.com/big-data/datalakes-and-analytics/
      Machine learning: https://aws.amazon.com/machine-learning/

      I am hiring VPs, Directors, Product managers, Principal Engineers, and ML scientists for various roles. So, please send me an email to [email protected] if you are interested.

  • Committee Member

    National Artificial Intelligence Advisory Committee

    - Present 2 years 11 months

    The National AI Advisory Committee (NAIAC) advises the U.S. President and the White House on the state of science around AI and U.S. AI workforce & competitiveness.

  • VP, Amazon AI

    Amazon Web Services

    - 5 years 1 month

    Seattle

    My team’s mission is “To put machine learning capabilities in the hands of every developer and data scientist.” My organization works on all aspects of machine learning: from ML frameworks and infrastructure, to Amazon SageMaker (an end-to-end service for building, training, and deploying ML models in the cloud and at the edge), to AI services (Transcribe, Translate, Personalize, Forecast, Rekognition, etc.) that make it easier for app developers to incorporate ML into their apps with no ML experience required.

    Also, see other coverage of re:Invent ML launches: https://press.aboutamazon.com/news-releases/news-release-details/amazon-web-services-announces-13-new-machine-learning-services

  • Amazon Web Services

    5 years 8 months

    • General Manager, NoSQL and Analytics

      Amazon Web Services

      - 3 years 9 months

      Greater Seattle Area

      I was the GM for NoSQL (Dynamo, ElastiCache) and Analytics (QuickSight) at Amazon. I bootstrapped the NoSQL services team for Amazon from scratch and got DynamoDB and ElastiCache to be two of the key pillar services for AWS, with broad success in adoption and revenue. I was responsible for the vision, execution, and operations of these services. I managed the engineering, product management, and operations for core AWS database services that are the foundational building blocks for AWS: DynamoDB, AWS Transactional Services (Paxos-based lock services, replication infrastructure), ElastiCache (in-memory engines), SimpleDB, and a few other services that were in the works.

      I have built several large-scale systems in the past. Some of the well-known ones (that are externally visible) include Amazon Dynamo and DynamoDB, Amazon CloudFront, Amazon RDS, Amazon ElastiCache, Amazon's core Paxos infra, and Amazon SNS Mobile Push. I have also built other large-scale systems that are used for building Amazon service infrastructure, including its core distributed lock service infrastructure. I am also one of the main authors of the Amazon Dynamo paper (along with Werner Vogels), which was published at SOSP 2007 and became the foundation of several other NoSQL stores.

      For more information on DynamoDB, see:
      http://www.youtube.com/watch?v=oz-7wJJ9HZ0
      and
      http://www.youtube.com/watch?v=wjJ3Dkl6VS4

      For ElastiCache, see: http://aws.amazon.com/elasticache/

      To build a 1TB cache in a matter of minutes, use ElastiCache: http://www.youtube.com/watch?v=4FQAtbgXrdY

      For QuickSight, see: http://aws.amazon.com/quicksight

      Product pages for services I managed:

      http://aws.amazon.com/dynamodb/
      http://aws.amazon.com/simpledb/
      http://aws.amazon.com/elasticache/
      http://aws.amazon.com/quicksight

    • Sr. Manager, NoSQL

      Amazon Web Services

      - 1 year 11 months

      Switching from Principal Engineer (where I built several AWS services hands-on) to engineering leadership, I bootstrapped NoSQL database services in AWS, which started DynamoDB, the core Paxos fabric layer, and our ElastiCache layer.

  • Amazon.com

    3 years 7 months

    • Principal Engineer in Cloud Computing Services

      Amazon.com

      - 2 years 2 months

      Reporting to the CTO; worked on the design, implementation, and analysis of several distributed systems technologies within my organization.

      Launched multiple cloud services such as Amazon CloudFront and Amazon RDS, and influenced the launch of several other AWS services.

    • Senior Research Engineer

      Amazon.com, CTO Office, Distributed Systems Technology

      - 1 year 6 months

      Reporting to the CTO; worked on the design, implementation, and analysis of several distributed systems technologies within my organization.

  • Research Engineer Intern

    Amazon.com, Distributed Systems Group

    - 4 months

    Worked on the design and implementation of scalable distributed systems. The work has led to a couple of patent applications, and the resulting systems will go into production soon.

  • Software Research Co-op

    IBM T J Watson Research Lab

    - 4 months

    Worked on the design and implementation of a new Grid system called Melody, which combined virtualized execution environments with P2P technologies to build a scalable distributed system platform for executing Grid jobs. The work led to a patent and a couple of publications in premier distributed systems conferences.

  • Software Research Co-op

    IBM T J Watson Research Lab

    - 4 months

    Worked on the design and implementation of new Grid systems called Harmony and Melody, which combined virtualized execution environments with P2P technologies to build a scalable distributed system platform for executing Grid jobs.
    The project has led to two patent inventions, and products are in the pipeline.

  • Software Intern

    IBM Linux Kernel Development Labs

    - 4 months

    Worked on improving the scalability of the Linux kernel for high-end multiprocessor servers. Designed and implemented a multi-queue Linux scheduler for SMPs. Also invented, designed, and implemented a new synchronization mechanism called Fairlocks that performed more efficiently than existing spinlocks on NUMA multiprocessor servers. This led to a granted patent (US Patent# 6779090).

Education

  • Vrije Universiteit Amsterdam (VU Amsterdam)

    Ph.D. Computer Science

    -

    - Published around 30 research papers in ACM/IEEE journals and conferences.
    - Won a best research paper award from IEEE Service Computing Technical committee.

  • Iowa State University

    M.S. Computer Engineering

    -

    Activities and Societies: Teaching Assistant for Real-time systems, Research Assistant - worked on design and implementation of new RTOS scheduling algorithms, SITAR - Society for Indian Tradition and ARts

    - Passed with dual honors (research excellence and teaching excellence awards). GPA: 4.0/4.0

    - Researched real-time scheduling for dynamic real-time systems. Focus areas: value-based scheduling and feedback-controlled real-time scheduling.
    - Published two journal papers and seven conference/workshop papers.
    - Awarded Research Excellence Award.

    - Taught "Digital Systems Design" and "Real-Time Systems" undergraduate courses.
    - Awarded Teaching Excellence Award by President of ISU.
    - Was involved in building the practical lab exercises for the Real-Time Systems course (CprE 558) from scratch.

  • College of Engineering Guindy, Chennai

    B.E. Computer Science & Engineering

    -

Publications

  • Feedback control for real-time scheduling

    American Control Conference

    Most real-time scheduling algorithms are open-loop algorithms as the scheduling decisions are based on the worst-case estimates of task parameters. In recent years, the "closed-loop" scheduling has gained importance due to its applicability to many real-world problems wherein the feedback information can be exploited efficiently to adjust task and/or scheduler parameters, thereby improving the system's performance. In this paper, we discuss an open-loop dynamic scheduling algorithm that employs a notion of task overlap in the scheduler in order to provide some flexibility in task execution time. Then we present a novel closed-loop approach for dynamically estimating the execution time of tasks based on both deadline miss ratio and task rejection ratio in the system. This approach is highly preferable for firm/soft real-time systems since it provides a firm performance guarantee in terms of deadline misses while achieving a high guarantee ratio. We design the proportional-integral controller and H∞ controller for closed loop scheduling. We evaluate the performance of the open-loop and the closed-loop approaches using simulation studies. We show that the closed-loop dynamic scheduling offers a better performance over the open-loop scheduling under all practical conditions.

  • Amazon Dynamo

    -

    Reliability at massive scale is one of the biggest challenges we face at Amazon.com, one of the largest e-commerce operations in the world; even the slightest outage has significant financial consequences and impacts customer trust. The Amazon.com platform, which provides services for many web sites worldwide, is implemented on top of an infrastructure of tens of thousands of servers and network components located in many datacenters around the world. At this scale, small and large components fail continuously and the way persistent state is managed in the face of these failures drives the reliability and scalability of the software systems.

    This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon’s core services use to provide an “always-on” experience. To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.

Patents

  • Dynamic Resource Commitment Management.

    Issued US 8,479,211

    This Elastic Block Storage (EBS) patent describes the management of the block storage to ensure that resources requested by clients are properly provisioned and available. For example, a client can request a resource. The EBS control module can create a volume, specifying the number of partitions and/or the way in which data is stored across partitions to guarantee the requested resource to the client. In addition, partitions and data storage can be updated dynamically without significantly impacting the customer using the volume. The management and update of resources are optimized using various techniques, including striping data, splitting mapping data, and balancing the data in partitions.

  • Managing Route Selection in a Communication Network

    Issued US 8,472,324

    This patent relates to a way to reduce processing load on network routers in our AWS data centers. In particular, the described technology can offload pre-determined route calculations and storage from edge routers to a central processing system in order to increase edge router performance. In some cases, this facilitates the use of lower-capability, cheap commodity-based routers.

  • RESOURCE ISOLATION THROUGH REINFORCEMENT LEARNING

    Issued US 8,429,096 B1

    Systems and methods for providing resource isolation in a shared computing environment using reinforcement learning (RL) techniques are disclosed. A resource isolation mechanism may be applied in a shared storage system, or database service, that limits the resource utilization of each namespace to its specified allocation. For example, the resource isolation mechanism may be used to limit the I/O utilization of database applications in a shared computing system (e.g., a system supporting a database service) to a specified limit. In such embodiments, RL techniques may be applied to the system to automatically control the rate of queries made by an application. RL techniques, such as those based on the State-Action-Reward-State-Action (SARSA) method may be effective in controlling the I/O utilization of database applications for different workloads. RL techniques may be applied globally by the service, or may be applied to particular subscribers, applications, shared resources, namespaces, or query types.

  • Managing content delivery network service providers

    Issued US 12/272,699

    A system, method, and computer readable medium for managing CDN service providers are provided. A network storage provider storing one or more resources on behalf of a content provider obtains client computing device requests for content. The network storage provider processes the client computing device requests and determines whether a subsequent request for the resource should be directed to a CDN service provider as a function of the updated or processed by the network storage provider storage component.

  • System-aware resource scheduling

    US 8,347,302
