[Question] Changing log levels for autorecovery on connection failure to bookies #4679

@Fredorixo

QUESTION

The following logs appear when autorecovery is temporarily unable to reach the bookies during pod restarts.

Can the log level for this scenario be changed from ERROR to WARN? Also, is there a way for autorecovery to wait for some period for the bookies to reach the running state before it logs errors?

2025-10-15T13:35:08,708+0000 [ReplicationWorker] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve pulsar-azure-us-west-2-bookkeeper2-2.pulsar-azure-us-west-2-bookkeeper2.pulsar.svc.cluster.local:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-10-15T13:35:08,708+0000 [ReplicationWorker] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to pulsar-azure-us-west-2-bookkeeper2-2.pulsar-azure-us-west-2-bookkeeper2.pulsar.svc.cluster.local:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId pulsar-azure-us-west-2-bookkeeper2-2.pulsar-azure-us-west-2-bookkeeper2.pulsar.svc.cluster.local:3181, bookie does not exist or it is not running
2025-10-15T13:35:08,710+0000 [ReplicationWorker] ERROR org.apache.bookkeeper.replication.ReplicationWorker - ReplicationWorker failed to replicate Ledger : 3542142 for 0 number of times, so deferring the ledger lock release by 9375 msecs
2025-10-15T13:35:08,710+0000 [ReplicationWorker] WARN  org.apache.bookkeeper.replication.ReplicationWorker - failed while replicating fragments
2025-10-15T13:35:13,721+0000 [main-EventThread] INFO  org.apache.bookkeeper.meta.ZkLedgerUnderreplicationManager - Latch countdown due to ZK event: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/ledgers/underreplication/locks zxid: 472446424233
2025-10-15T13:35:13,722+0000 [ReplicationWorker] INFO  org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve pulsar-azure-us-west-2-bookkeeper2-2.pulsar-azure-us-west-2-bookkeeper2.pulsar.svc.cluster.local:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2025-10-15T13:35:13,723+0000 [ReplicationWorker] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to pulsar-azure-us-west-2-bookkeeper2-2.pulsar-azure-us-west-2-bookkeeper2.pulsar.svc.cluster.local:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId pulsar-azure-us-west-2-bookkeeper2-2.pulsar-azure-us-west-2-bookkeeper2.pulsar.svc.cluster.local:3181, bookie does not exist or it is not running
2025-10-15T13:35:13,723+0000 [ReplicationWorker] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: null/pulsar-azure-us-west-2-bookkeeper2-2.pulsar-azure-us-west-2-bookkeeper2.pulsar.svc.cluster.local:3181, current state CONNECTING :
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId pulsar-azure-us-west-2-bookkeeper2-2.pulsar-azure-us-west-2-bookkeeper2.pulsar.svc.cluster.local:3181, bookie does not exist or it is not running
        at org.apache.bookkeeper.client.DefaultBookieAddressResolver.resolve(DefaultBookieAddressResolver.java:66) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.proto.PerChannelBookieClient.connect(PerChannelBookieClient.java:399) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.proto.PerChannelBookieClient.connectIfNeededAndDoOp(PerChannelBookieClient.java:525) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.proto.DefaultPerChannelBookieClientPool.obtain(DefaultPerChannelBookieClientPool.java:120) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.proto.DefaultPerChannelBookieClientPool.obtain(DefaultPerChannelBookieClientPool.java:115) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.proto.BookieClientImpl.readEntry(BookieClientImpl.java:514) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.proto.BookieClientImpl.readEntry(BookieClientImpl.java:500) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.proto.BookieClientImpl.readEntry(BookieClientImpl.java:494) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.client.LedgerChecker.checkLedger(LedgerChecker.java:455) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.replication.ReplicationWorker.getDataLossFragments(ReplicationWorker.java:631) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.replication.ReplicationWorker.getUnderreplicatedFragments(ReplicationWorker.java:615) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.replication.ReplicationWorker.rereplicate(ReplicationWorker.java:457) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.replication.ReplicationWorker.rereplicate(ReplicationWorker.java:302) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.replication.ReplicationWorker.run(ReplicationWorker.java:250) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty-netty-common-4.1.122.Final.jar:4.1.122.Final]
        at java.base/java.lang.Thread.run(Unknown Source) [?:?]
Caused by: org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
        at org.apache.bookkeeper.discover.ZKRegistrationClient.getBookieServiceInfo(ZKRegistrationClient.java:226) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        at org.apache.bookkeeper.client.DefaultBookieAddressResolver.resolve(DefaultBookieAddressResolver.java:45) ~[com.datastax.oss-bookkeeper-server-4.17.1.0.0.3.jar:4.17.1.0.0.3]
        ... 15 more
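
To make the second request (waiting before reporting errors) concrete, here is a minimal sketch of the kind of behavior being asked about. It is purely illustrative: `GracePeriodLogLevel`, its methods, and the 30-second grace period used below are hypothetical and are not existing BookKeeper classes or settings. The idea is that a connection failure is reported at WARN while a bookie has been unreachable for less than the grace period, and at ERROR only once the outage outlasts it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper, NOT part of BookKeeper: picks a log level for a
 * connection failure based on how long the bookie has been unreachable.
 */
final class GracePeriodLogLevel {
    enum Level { WARN, ERROR }

    private final long gracePeriodMillis;

    // Time of the first failure in the current outage, keyed by bookie id.
    private final Map<String, Long> firstFailureMillis = new ConcurrentHashMap<>();

    GracePeriodLogLevel(long gracePeriodMillis) {
        this.gracePeriodMillis = gracePeriodMillis;
    }

    /** Records a failure; returns WARN inside the grace period, ERROR once it has elapsed. */
    Level onFailure(String bookieId, long nowMillis) {
        long first = firstFailureMillis.computeIfAbsent(bookieId, id -> nowMillis);
        return (nowMillis - first) < gracePeriodMillis ? Level.WARN : Level.ERROR;
    }

    /** Clears the outage state once the bookie is reachable again. */
    void onSuccess(String bookieId) {
        firstFailureMillis.remove(bookieId);
    }
}
```

With an assumed grace period of 30 seconds, the two failures in the logs above (13:35:08 and 13:35:13, five seconds apart) would both be reported at WARN under this scheme, and ERROR would appear only if the bookie were still unresolvable 30 seconds after the first failure.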
