技术

Durid abandon connection错误分析

2019-04-26  本文已影响0人  那些年搬过的砖
最近线上碰到一个问题,提示放弃数据库连接
2019-04-23 15:15:15.137 ERROR [Druid-ConnectionPool-Destroy-29530630]com.alibaba.druid.pool.DruidDataSource.removeAbandoned():2437 -abandon connection, owner thread: Thread-2187, connected at : 1556001108595, open stackTrace
    at java.lang.Thread.getStackTrace(Thread.java:1559)
    at com.alibaba.druid.pool.DruidDataSource.getConnectionDirect(DruidDataSource.java:1252)
    at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:4619)
    at com.alibaba.druid.filter.logging.LogFilter.dataSource_getConnection(LogFilter.java:874)
    at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:4615)
    at com.alibaba.druid.filter.stat.StatFilter.dataSource_getConnection(StatFilter.java:666)
    at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:4615)
    at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1176)
    at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1168)
    at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:104)
......
2019-04-23 15:15:15.137 ERROR [Druid-ConnectionPool-Destroy-29530630]com.alibaba.druid.pool.DruidDataSource.removeAbandoned():2437 -abandon connection, owner thread: Thread-2187, connected at : 1556001108595, open stackTrace
    at java.lang.Thread.getStackTrace(Thread.java:1559)
    at com.alibaba.druid.pool.DruidDataSource.getConnectionDirect(DruidDataSource.java:1252)
    at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:4619)
    at com.alibaba.druid.filter.logging.LogFilter.dataSource_getConnection(LogFilter.java:874)
    at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:4615)
    at com.alibaba.druid.filter.stat.StatFilter.dataSource_getConnection(StatFilter.java:666)
    at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:4615)
    at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1176)
    at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1168)
    at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:104)
spring事务的执行有以下几个步骤
1:获取连接 Connection con = DriverManager.getConnection()
2:取消事务的自动提交con.setAutoCommit(false);
3:设置事务的传播属性
4:设置事务的超时时间
5:执行CURD
6:提交事务/回滚事务 con.commit() / con.rollback();
7:关闭连接 conn.close();
@Override
    protected void doBegin(Object transaction, TransactionDefinition definition) {
        DataSourceTransactionObject txObject = (DataSourceTransactionObject) transaction;
        Connection con = null;

        try {
            if (txObject.getConnectionHolder() == null ||
                    txObject.getConnectionHolder().isSynchronizedWithTransaction()) {
                Connection newCon = this.dataSource.getConnection();
                if (logger.isDebugEnabled()) {
                    logger.debug("Acquired Connection [" + newCon + "] for JDBC transaction");
                }
                txObject.setConnectionHolder(new ConnectionHolder(newCon), true);
            }

            txObject.getConnectionHolder().setSynchronizedWithTransaction(true);
            //获取连接
            con = txObject.getConnectionHolder().getConnection();

            Integer previousIsolationLevel = DataSourceUtils.prepareConnectionForTransaction(con, definition);
            txObject.setPreviousIsolationLevel(previousIsolationLevel);

            // Switch to manual commit if necessary. This is very expensive in some JDBC drivers,
            // so we don't want to do it unnecessarily (for example if we've explicitly
            // configured the connection pool to set it already).
            //关闭自动提交
            if (con.getAutoCommit()) {
                txObject.setMustRestoreAutoCommit(true);
                if (logger.isDebugEnabled()) {
                    logger.debug("Switching JDBC Connection [" + con + "] to manual commit");
                }
                con.setAutoCommit(false);
            }

            //设置事务传播属性
            prepareTransactionalConnection(con, definition);
            txObject.getConnectionHolder().setTransactionActive(true);

            int timeout = determineTimeout(definition);
            //设置事务超时时间
            if (timeout != TransactionDefinition.TIMEOUT_DEFAULT) {
                txObject.getConnectionHolder().setTimeoutInSeconds(timeout);
            }

            // Bind the connection holder to the thread.
            if (txObject.isNewConnectionHolder()) {
                TransactionSynchronizationManager.bindResource(getDataSource(), txObject.getConnectionHolder());
            }
        }

        catch (Throwable ex) {
            if (txObject.isNewConnectionHolder()) {
                DataSourceUtils.releaseConnection(con, this.dataSource);
                txObject.setConnectionHolder(null, false);
            }
            throw new CannotCreateTransactionException("Could not open JDBC Connection for transaction", ex);
        }
    }

从上面代码我们可以得到这么几个重要的信息

1:数据的连接是在事务开始时就获取,而不是在执行curd时才获取。所以即时你的curd很快,但如果整体事务时间较长,仍然会超出数据源的最大时间限制(这个时间并不是事务的超时时间,而是数据源自身的一种保护机制,后面可以看到)

2:事务的隔离级别和超时时间是解析的@Transactional注解。

我们再来看下druid数据源的配置信息
<!-- 配置间隔多久才进行一次检测,检测需要关闭的空闲连接,单位是毫秒 -->
<property name="timeBetweenEvictionRunsMillis" value="${timeBetweenEvictionRunsMillis}"></property>
<!-- 打开removeAbandoned功能 -->
<property name="removeAbandoned" value="true"/>
<!--&lt;!&ndash; 1800秒,也就是30分钟 &ndash;&gt;-->
<property name="removeAbandonedTimeout" value="1800"/>

这两个配置,可以理解为一种数据源自身的保护,如果Druid的销毁线程检测到数据库连接的占用时间已经超过了removeAbandonedTimeout设置的阈值,那么会强制断开此次数据库连接。

实际业务中,可能数据库连接占用时间超过removeAbandonedTimeout才被强制关闭。假如我们定义removeAbandonedTimeout为T1,timeBetweenEvictionRunsMillis为T2,那么事务占用数据库连接的最长时间在[T1, T1+T2]之间,也就是实际占用连接时间可能超过连接占用最大时间的阈值,这个要看轮询频率的大小而定。

//连接销毁任务
public class DestroyTask implements Runnable {
    @Override
    public void run() {
        shrink(true, keepAlive);
        if (isRemoveAbandoned()) {
            removeAbandoned();
        }
    }
}
public int removeAbandoned() {
    int removeCount = 0;

    long currrentNanos = System.nanoTime();

    List<DruidPooledConnection> abandonedList = new ArrayList<DruidPooledConnection>();

    activeConnectionLock.lock();
    try {
        Iterator<DruidPooledConnection> iter = activeConnections.keySet().iterator();

        for (; iter.hasNext();) {
            DruidPooledConnection pooledConnection = iter.next();

            if (pooledConnection.isRunning()) {
                continue;
            }

            long timeMillis = (currrentNanos - pooledConnection.getConnectedTimeNano()) / (1000 * 1000);

            //计算时间,超出阈值的添加到销毁列表中
            if (timeMillis >= removeAbandonedTimeoutMillis) {
                iter.remove();
                pooledConnection.setTraceEnable(false);
                abandonedList.add(pooledConnection);
            }
        }
    } finally {
        activeConnectionLock.unlock();
    }
    ...
}

以下伪代码中,即时操作2非常快,但是操作1和操作3较慢,有可能导致数据库连接未释放,而数据源本身的保护机制可能会强制放弃该连接,导致数据库操作失败。所以最好的方式是尽量将数据库操作单独添加事务,而不是将整个与数据库操作毫无关系的操作放到一起,这样会占用不必要的数据库资源(注意:不能简单的将操作2写到一个内部方法中,因为spring事务的代理都是针对外层方法调用,所以最好是将操作2提炼到一个service中,再添加事务支持)

@Transactional(...)
public a(){
    //非数据库操作1(IO操作等,较慢)
    //数据库操作2(非常快)
    //非数据库操作3(较慢)
}
上一篇下一篇

猜你喜欢

热点阅读