Root-Cause Analysis: Database Connections Exhausted Under DBCP

Date: 2023-05-15 14:25:40  Source: 今日头条  Author: 东东程序猿

Background

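Over the past year there have been several incidents in which the database's connections were exhausted. The initial analysis was that the application was using more connections than the connection pool (DBCP 1.4) limit allows, which filled up the database's connection slots. One of the conclusions was that a bug in the connection pool was responsible.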

Problem Analysis

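1. A DBCP bug can let the number of connections exceed the configured size. The root cause is that more than one connection pool may be created while the data source is being initialized.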

protected synchronized DataSource createDataSource()
        throws SQLException {
        if (closed) {
            throw new SQLException("Data source is closed");
        }

        // Return the pool if we have already created it
        if (dataSource != null) {
            return (dataSource);
        }

        // create factory which returns raw physical connections
        ConnectionFactory driverConnectionFactory = createConnectionFactory();

        // create a pool for our connections
        createConnectionPool();

        // Set up statement pool, if desired
        GenericKeyedObjectPoolFactory statementPoolFactory = null;
        if (isPoolPreparedStatements()) {
            statementPoolFactory = new GenericKeyedObjectPoolFactory(null,
                        -1, // unlimited maxActive (per key)
                        GenericKeyedObjectPool.WHEN_EXHAUSTED_FAIL,
                        0, // maxWait
                        1, // maxIdle (per key)
                        maxOpenPreparedStatements);
        }

        // Set up the poolable connection factory
        createPoolableConnectionFactory(driverConnectionFactory, statementPoolFactory, abandonedConfig);

        // Create and return the pooling data source to manage the connections
        createDataSourceInstance();

        try {
            for (int i = 0 ; i < initialSize ; i++) {
                connectionPool.addObject();
            }
        } catch (Exception e) {
            throw new SQLNestedException("Error preloading the connection pool", e);
        }

        return dataSource;
    }

createDataSource calls createConnectionPool. If a later step fails, for example createPoolableConnectionFactory, the next call to createDataSource calls createConnectionPool again, so the connection pool is initialized more than once. Here is createConnectionPool:
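The failure mode described above can be sketched without DBCP at all. The class below is a hypothetical stand-in (LazyInitSketch, poolsCreated and failNextFactoryStep are illustrative names, not DBCP identifiers); it only assumes what the listing shows, namely that the pool is created before the step that may fail and that dataSource is assigned only at the very end.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for BasicDataSource; this is not DBCP code. */
final class LazyInitSketch {
    /** Every pool ever created, so the demo can count orphaned ones. */
    final List<Object> poolsCreated = new ArrayList<>();
    private Object dataSource;           // published only when every step succeeds
    private Object connectionPool;       // overwritten on every attempt
    private boolean failNextFactoryStep; // simulates a failing later step

    LazyInitSketch(boolean failFirstAttempt) {
        this.failNextFactoryStep = failFirstAttempt;
    }

    synchronized Object createDataSource() {
        if (dataSource != null) {
            return dataSource;             // already initialised
        }
        connectionPool = new Object();     // mirrors createConnectionPool()
        poolsCreated.add(connectionPool);
        if (failNextFactoryStep) {         // mirrors createPoolableConnectionFactory failing
            failNextFactoryStep = false;
            throw new IllegalStateException("factory step failed");
        }
        dataSource = connectionPool;
        return dataSource;
    }

    /** Calls createDataSource twice and returns how many pools were created. */
    static int poolsAfterRetry(boolean failFirstAttempt) {
        LazyInitSketch ds = new LazyInitSketch(failFirstAttempt);
        try {
            ds.createDataSource();
        } catch (IllegalStateException expected) {
            // the first attempt failed after its pool had already been created
        }
        ds.createDataSource();
        return ds.poolsCreated.size();
    }

    public static void main(String[] args) {
        System.out.println(poolsAfterRetry(true));   // prints 2
        System.out.println(poolsAfterRetry(false));  // prints 1
    }
}
```

In this toy model the first pool is simply orphaned; the pool that finally gets published is the second one.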

 protected void createConnectionPool() {
        // Create an object pool to contain our active connections
        GenericObjectPool gop;
        if ((abandonedConfig != null) && (abandonedConfig.getRemoveAbandoned())) {
            gop = new AbandonedObjectPool(null,abandonedConfig);
        }
        else {
            gop = new GenericObjectPool();
        }
        gop.setMaxActive(maxActive);
        gop.setMaxIdle(maxIdle);
        gop.setMinIdle(minIdle);
        gop.setMaxWait(maxWait);
        gop.setTestOnBorrow(testOnBorrow);
        gop.setTestOnReturn(testOnReturn);
        gop.setTimeBetweenEvictionRunsMillis(timeBetweenEvictionRunsMillis);
        gop.setNumTestsPerEvictionRun(numTestsPerEvictionRun);
        gop.setMinEvictableIdleTimeMillis(minEvictableIdleTimeMillis);
        gop.setTestWhileIdle(testWhileIdle);
        connectionPool = gop;
    }

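Clearly this can only happen while the module is starting up and initializing, and the connections in the orphaned pool are never used. In the production incidents, however, the connections were all executing SQL, so this is not the cause. This bug has already been fixed in 1.4.x.
2. Startup parameter analysis. Starting from the configuration, check whether the pool itself releases connections that are still in use and then creates new ones, which would look to us as if the "connection count" exceeded the pool limit. Below are the connection pool parameters of one inbound (入库) data source: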

partition1.driverClassName=com.mysql.jdbc.Driver
partition1.initialSize=2
partition1.maxActive=25
partition1.minIdle=2
partition1.maxIdle=5
partition1.maxWait=3000
partition1.threadPoolSize=10
partition1.logAbandoned=true
partition1.testWhileIdle=true
partition1.testOnReturn=false
partition1.testOnBorrow=true
partition1.validationQuery=select now()
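// Number of connections examined during each run of the idle-connection evictor thread (if any)
partition1.numTestsPerEvictionRun=5
// Time the idle-connection evictor thread sleeps between runs, in milliseconds
partition1.timeBetweenEvictionRunsMillis=30000
// Minimum time a connection may sit idle in the pool before the idle-connection evictor thread may evict it, in milliseconds
partition1.minEvictableIdleTimeMillis=180000
// With removeAbandoned=true, when getNumActive() gets close to getMaxActive() the pool reclaims abandoned connections, i.e. connections that have not been used for removeAbandonedTimeout seconds (default 300)
partition1.removeAbandoned=true
// Timeout after which a connection is forcibly reclaimed, in seconds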
partition1.removeAbandonedTimeout=18

Note the connectionProperties parameter:

<bean id="partition[j]" class="org.apache.commons.dbcp.BasicDataSource"
          destroy-method="close">
        <property name="driverClassName" value="${partition[j].driverClassName}" ></property>
        <property name="url" value="${partition[j].url}" ></property>
        <property name="username" value="${partition[j].username}" ></property>
        <property name="password" value="${partition[j].password}" ></property>
        <property name="defaultAutoCommit" value="false" ></property>
        <property name="maxActive" value="${partition[j].maxActive}" ></property>
        <property name="maxIdle" value="${partition[j].maxIdle}" ></property>
        <property name="maxWait" value="${partition[j].maxWait}" ></property>
        <property name="initialSize" value="${partition[j].initialSize}" ></property>
        <property name="minIdle" value="${partition[j].minIdle}" ></property>
        <property name="logAbandoned" value="${partition[j].logAbandoned}" ></property>
        <property name="testWhileIdle" value="${partition[j].testWhileIdle}" ></property>
        <property name="testOnReturn" value="${partition[j].testOnReturn}" ></property>
        <property name="testOnBorrow" value="${partition[j].testOnBorrow}" ></property>
        <property name="validationQuery" value="${partition[j].validationQuery}" ></property>
        <property name="numTestsPerEvictionRun" value="${partition[j].numTestsPerEvictionRun}" ></property>
        <property name="timeBetweenEvictionRunsMillis" value="${partition[j].timeBetweenEvictionRunsMillis}" ></property>
        <property name="minEvictableIdleTimeMillis" value="${partition[j].minEvictableIdleTimeMillis}" ></property>
        <property name="removeAbandoned" value="${partition[j].removeAbandoned}" ></property>
        <property name="removeAbandonedTimeout" value="${partition[j].removeAbandonedTimeout}" ></property>
        <property name="connectionProperties" value="useUnicode=true;
   characterEncoding=utf8;initialTimeout=1;connectTimeout=1000;socketTimeout=6000;
   rewriteBatchedStatements=true;autoReconnectForPools=true;autoReconnect=true;maxReconnects=1;
   failOverReadOnly=false;roundRobinLoadBalance=true;allowMultiQueries=true"></property>
    </bean>

Of the parameters above, the one to focus on is removeAbandonedTimeout and what it actually does:

  // Obtain (borrow) a connection
   public Object borrowObject() throws Exception {
        if (config != null
                && config.getRemoveAbandoned()
                && (getNumIdle() < 2)
                && (getNumActive() > getMaxActive() - 3) ) {
            removeAbandoned();
        }
        Object obj = super.borrowObject();
        if (obj instanceof AbandonedTrace) {
            ((AbandonedTrace) obj).setStackTrace();
        }
        if (obj != null && config != null && config.getRemoveAbandoned()) {
            synchronized (trace) {
                trace.add(obj);
            }
        }
    ...

    private void removeAbandoned() {
        // Generate a list of abandoned connections to remove
        long now = System.currentTimeMillis();
        long timeout = now - (config.getRemoveAbandonedTimeout() * 1000);
        ArrayList remove = new ArrayList();
        synchronized (trace) {
            Iterator it = trace.iterator();
            while (it.hasNext()) {
                AbandonedTrace pc = (AbandonedTrace) it.next();
                if (pc.getLastUsed() > timeout) {
                    continue;
                }
                if (pc.getLastUsed() > 0) {
                    remove.add(pc);
                }
            }
        }

        // Now remove the abandoned connections
        Iterator it = remove.iterator();
        while (it.hasNext()) {
            AbandonedTrace pc = (AbandonedTrace) it.next();
            if (config.getLogAbandoned()) {
                pc.printStackTrace();
            }             
            try {
                invalidateObject(pc);
            } catch (Exception e) {
                e.printStackTrace();
            }

        }
    }

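The code shows that once removeAbandoned and removeAbandonedTimeout are configured, connections that are still in use can be removed. If the database is executing SQL on such a connection at that moment, the pool has already dropped the connection while the database-side session is still running, so the database ends up holding more connections than the pool limit. A small number of slow SQL statements running longer than removeAbandonedTimeout causes no trouble, but a large number of them makes these orphaned sessions pile up until the database's connections are exhausted.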
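To make the two checks in the listing concrete, here is a minimal sketch that restates them as pure functions. AbandonedCheckSketch, shouldSweep and isAbandoned are hypothetical names introduced for illustration; the arithmetic is copied from borrowObject() and removeAbandoned() as shown above.

```java
/** Hypothetical helper that restates the two checks from the listing; not DBCP code. */
final class AbandonedCheckSketch {
    /** Same condition as borrowObject(): few idle objects and the active count near the cap. */
    static boolean shouldSweep(int numIdle, int numActive, int maxActive) {
        return numIdle < 2 && numActive > maxActive - 3;
    }

    /** Same test as removeAbandoned(): last used at or before (now - timeout), and ever used. */
    static boolean isAbandoned(long lastUsedMillis, long nowMillis, int timeoutSeconds) {
        long cutoff = nowMillis - timeoutSeconds * 1000L;
        return lastUsedMillis > 0 && lastUsedMillis <= cutoff;
    }

    public static void main(String[] args) {
        // Production-like settings from the properties above: maxActive=25, timeout=18 s.
        System.out.println(shouldSweep(0, 22, 25));        // false: 22 is not > 22
        System.out.println(shouldSweep(0, 23, 25));        // true
        System.out.println(isAbandoned(1_000L, 20_000L, 18)); // true: 19 s since last use
        System.out.println(isAbandoned(5_000L, 20_000L, 18)); // false: 15 s since last use
        // With maxActive=1 the sweep condition holds even with zero active connections.
        System.out.println(shouldSweep(0, 0, 1));          // true: 0 > -2
    }
}
```

With maxActive=25 the sweep only starts once 23 connections are active, so under these settings a statement that has been running for more than 18 seconds becomes a candidate only when the pool is nearly full.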

Experiment

Start a MySQL test instance

mkdir -p /usr/local/data/mysql
docker run -d -e MYSQL_ROOT_PASSWORD=root --name mysql-57-centos7 -v /usr/local/data/mysql:/var/lib/mysql -p 3306:3306 mysql --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --lower_case_table_names=1

Create the test database

create database test ;

Build the test class; the core code is:

public void setUp() throws Exception {
        ds = createDataSource();
        ds.setDriverClassName("com.mysql.jdbc.Driver");
        ds.setUrl("jdbc:mysql://127.0.0.1:3306/test?allowMultiQueries=true&characterEncoding=UTF-8&autoReconnect=true&failOverReadOnly=false&useSSL=false&allowPublicKeyRetrieval=true");
        ds.setUsername("root");
        ds.setPassword("root"); // must match MYSQL_ROOT_PASSWORD used when starting the container
        ds.setMaxActive(1);
        ds.setMaxWait(1000);
        ds.setTestWhileIdle(true);
        ds.setTestOnBorrow(true);
        ds.setTestOnReturn(false);
        ds.setValidationQuery("select now()");
        ds.setNumTestsPerEvictionRun(5);
        ds.setMinEvictableIdleTimeMillis(2000);
        ds.setLogAbandoned(true);
        ds.setRemoveAbandoned(true);
        ds.setRemoveAbandonedTimeout(1);
}
public void testAbandoned() throws Exception {

        for (int i = 0; i < 20; i++) {
            Thread t = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        Connection conn = ds.getConnection();
                        Statement statement = conn.createStatement();
                        ResultSet resultSet = null; 
                        // Simulate a slow SQL statement
                        resultSet = statement.executeQuery("select  sleep(100),now()");
                        while (resultSet.next()) {
                            System.out.println("result+" + resultSet.getString(1));
                        }
                        resultSet.close();
                        statement.close();
                        conn.close();
                    } catch (Exception ex) {
                        System.out.println(ex.getMessage());

                    }
                    System.out.println(Thread.currentThread().getName() + "---------------------- end----------------------");
                }
            });
            t.setName(i + "");
            t.start();
            Thread.sleep((i + 1) * 1000);
        }
        System.out.println(Thread.currentThread().getName() + "---------------------- end----------------------");
        Thread.sleep(1000000);
    }

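Running the code above did not exceed the maximum connection limit of 1 as we had expected. With RemoveAbandoned=true and RemoveAbandonedTimeout=1 the pool's abandoned mechanism should have been triggered, but execution blocked at one line in DelegatingStatement's close method, shown below: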

/**
     * Close this DelegatingStatement, and close
     * any ResultSets that were not explicitly closed.
     */
    public void close() throws SQLException {
        try {
            try {
                if (_conn != null) {
                    _conn.removeTrace(this);
                    _conn = null;
                }

                // The JDBC spec requires that a statement close any open
                // ResultSet's when it is closed.
                // FIXME The PreparedStatement we're wrapping should handle this for us.
                // See bug 17301 for what could happen when ResultSets are closed twice.
                List resultSets = getTrace();
                if( resultSets != null) {
                    ResultSet[] set = (ResultSet[]) resultSets.toArray(new ResultSet[resultSets.size()]);
                    for (int i = 0; i < set.length; i++) {
                        set[i].close();
                    }
                    clearTrace();
                }
                // Execution blocks here
                _stmt.close();
            }
            catch (SQLException e) {
                handleException(e);
            }
        }
        finally {
            _closed = true;
        }
    }

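When we set socketTimeout=1000 in the URL, the blocked call completes, and the database shows a large number of connections, well above the limit of 1. The problem is reproduced! The typical log message also appears: The last packet successfully received from the server was 1,001 milliseconds ago. The last packet sent successfully to the server was 1,001 milliseconds ago.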
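Why a timeout lets the blocked call return can be illustrated with plain sockets. This is only a sketch under the assumption that the driver's socketTimeout behaves like an ordinary socket read timeout; SocketTimeoutSketch and readTimesOut are hypothetical names, and no MySQL driver is involved.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Plain-socket illustration of a read timeout; it does not use a JDBC driver. */
final class SocketTimeoutSketch {
    /** Returns true if a read against a silent server gives up with SocketTimeoutException. */
    static boolean readTimesOut(int soTimeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            client.setSoTimeout(soTimeoutMillis);   // plays the role of socketTimeout
            try {
                client.getInputStream().read();     // the server side never writes anything
                return false;                       // not expected in this demo
            } catch (SocketTimeoutException e) {
                return true;                        // the blocked read is released
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readTimesOut(200));
    }
}
```

Without a read timeout the same read would wait until the peer answers or closes, which matches the blocking seen in the experiment before socketTimeout was set.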

With socketTimeout=1000 and RemoveAbandoned=false, the problem did not reproduce; instead there were many timeouts while waiting for a connection.

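The reason is the difference in the code below: with RemoveAbandoned enabled the pool is an AbandonedObjectPool, otherwise it is the default GenericObjectPool, and AbandonedObjectPool adds the abandoned-connection logic.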

    protected void createConnectionPool() {
        // Create an object pool to contain our active connections
        GenericObjectPool gop;
        if ((abandonedConfig != null) && (abandonedConfig.getRemoveAbandoned())) {
            gop = new AbandonedObjectPool(null,abandonedConfig);
        }
        else {
            gop = new GenericObjectPool();
        }
        gop.setMaxActive(maxActive);
        gop.setMaxIdle(maxIdle);
        gop.setMinIdle(minIdle);
        gop.setMaxWait(maxWait);
        gop.setTestOnBorrow(testOnBorrow);
        gop.setTestOnReturn(testOnReturn);
        gop.setTimeBetweenEvictionRunsMillis(timeBetweenEvictionRunsMillis);
        gop.setNumTestsPerEvictionRun(numTestsPerEvictionRun);
        gop.setMinEvictableIdleTimeMillis(minEvictableIdleTimeMillis);
        gop.setTestWhileIdle(testWhileIdle);
        connectionPool = gop;
    }
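The contrast between the two pool types can be reproduced in a toy model. The sketch below is a deliberately simplified simulation, not DBCP: it assumes one borrower arrives every (timeout + 1) seconds, that the client side of an evicted connection has already been released, and that the query keeps running on the server. AbandonedLeakSketch and peakServerSessions are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; it uses neither DBCP nor a real database. */
final class AbandonedLeakSketch {
    /**
     * Returns the peak number of sessions the simulated server holds at once.
     * One borrower arrives every (timeoutSec + 1) seconds and runs a query of querySec seconds.
     */
    static int peakServerSessions(boolean removeAbandoned, int borrowers,
                                  int maxActive, int timeoutSec, int querySec) {
        List<Integer> trackedStarts = new ArrayList<>(); // connections the pool still counts
        List<Integer> serverEnds = new ArrayList<>();    // when each server session finishes
        int peak = 0;
        for (int i = 0; i < borrowers; i++) {
            final int now = i * (timeoutSec + 1);
            // Finished queries free both the server session and the pool slot.
            serverEnds.removeIf(end -> end <= now);
            trackedStarts.removeIf(start -> start + querySec <= now);
            if (removeAbandoned) {
                // The pool forgets connections older than the timeout,
                // but their queries keep running on the server.
                trackedStarts.removeIf(start -> start <= now - timeoutSec);
            }
            if (trackedStarts.size() < maxActive) {
                trackedStarts.add(now);
                serverEnds.add(now + querySec);
                peak = Math.max(peak, serverEnds.size());
            }
            // Otherwise this borrower would time out waiting (maxWait) and get no connection.
        }
        return peak;
    }

    public static void main(String[] args) {
        // Settings from the test: maxActive=1, removeAbandonedTimeout=1 s, sleep(100), 20 threads.
        System.out.println(peakServerSessions(true, 20, 1, 1, 100));   // prints 20
        System.out.println(peakServerSessions(false, 20, 1, 1, 100));  // prints 1
    }
}
```

In this model the pool never counts more than one connection in either case; only the server-side session count differs, which is the gap the experiment exposes.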

Summary

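When RemoveAbandoned=true and a statement runs longer than socketTimeout, then once the RemoveAbandonedTimeout trigger point is reached the number of database connections exceeds the pool limit. Note that shutting down the module does not help in this situation, because the SQL is still executing in the database; kill the sessions directly or switch over to another database!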
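As a rough illustration of the "kill directly" step, the sketch below turns a snapshot of session ids and running times (for example, the ID and TIME columns of MySQL's processlist) into KILL statements for sessions over a threshold. KillStatementSketch, killStatements and the threshold are hypothetical; how the snapshot is obtained and how the statements are executed is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: builds KILL statements from a processlist-like snapshot. */
final class KillStatementSketch {
    /** secondsById maps a session id to the seconds its current statement has been running. */
    static List<String> killStatements(Map<Long, Integer> secondsById, int thresholdSeconds) {
        List<String> statements = new ArrayList<>();
        for (Map.Entry<Long, Integer> entry : secondsById.entrySet()) {
            if (entry.getValue() > thresholdSeconds) {
                statements.add("KILL " + entry.getKey() + ";");
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        Map<Long, Integer> snapshot = new LinkedHashMap<>(); // made-up sample data
        snapshot.put(11L, 120);
        snapshot.put(12L, 3);
        snapshot.put(13L, 95);
        System.out.println(killStatements(snapshot, 60)); // prints [KILL 11;, KILL 13;]
    }
}
```

Reviewing the generated statements before running them is advisable, since the threshold alone cannot tell an orphaned session from a legitimate long-running job.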

Remediation

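1. Turn off RemoveAbandoned. The WMS system is currently still a mixed AP and TP workload, so there is a high probability of triggering this situation.
2. Per 邱玉堃's re-verification in module testing, the socketTimeout parameter needs to be turned off; the referenced 1.4 jar differs from the 1.4 source code.
3. Note that the bug occurs when the mysql-connector-java version is lower than 5.1.45.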
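For the version condition in item 3, a small check could look like the sketch below. It assumes purely numeric dotted version strings; ConnectorVersionSketch and isOlderThan are hypothetical names, and how the driver version is obtained at runtime is not covered here.

```java
/** Hypothetical helper comparing purely numeric dotted versions such as 5.1.45. */
final class ConnectorVersionSketch {
    /** Returns true if version is strictly lower than minimum; missing parts count as 0. */
    static boolean isOlderThan(String version, String minimum) {
        String[] a = version.split("\\.");
        String[] b = minimum.split("\\.");
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? Integer.parseInt(a[i]) : 0;
            int y = i < b.length ? Integer.parseInt(b[i]) : 0;
            if (x != y) {
                return x < y;   // the first differing component decides
            }
        }
        return false;           // equal versions are not older
    }

    public static void main(String[] args) {
        System.out.println(isOlderThan("5.1.38", "5.1.45")); // true
        System.out.println(isOlderThan("5.1.45", "5.1.45")); // false
        System.out.println(isOlderThan("8.0.28", "5.1.45")); // false
    }
}
```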
