I have a 12 GB GPU, but training with the default settings runs out of memory (OOM) during the first epoch. I had to reduce batch_size and dilation_depth substantially before training would even start. What settings are you using when you train? The allocator log from the failed run follows.
I tensorflow/core/common_runtime/bfc_allocator.cc:689] Summary of in-use Chunks by size:
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 83 Chunks of size 256 totalling 20.8KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 512 totalling 512B
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 15 Chunks of size 1024 totalling 15.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 1280 totalling 1.2KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 65536 totalling 64.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 59 Chunks of size 262144 totalling 14.75MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 520704 totalling 508.5KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 105 Chunks of size 524288 totalling 52.50MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 13 Chunks of size 67108864 totalling 832.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 2 Chunks of size 67174400 totalling 128.12MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 67239936 totalling 64.12MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 67371008 totalling 64.25MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 67633152 totalling 64.50MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 68157440 totalling 65.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 134479872 totalling 128.25MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 269484032 totalling 257.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 541065216 totalling 516.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 1090519040 totalling 1.02GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 2147483648 totalling 2.00GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 2214592512 totalling 2.06GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 3726535936 totalling 3.47GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] Sum Total of in-use chunks: 10.68GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:698] Stats:
Limit: 11715375924
InUse: 11472467200
MaxInUse: 11473515776
NumAllocs: 563
MaxAllocSize: 3980291328
W tensorflow/core/common_runtime/bfc_allocator.cc:270] ****************************************************************************************xxxxxxxxxxxx
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 2.00GiB. See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:968] Resource exhausted: OOM when allocating tensor with shape[65536,256,32,1]
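For what it's worth, the failed 2.00GiB request lines up with the tensor shape in the last log line if I assume the elements are float32 (4 bytes each); the log itself does not state the dtype, so that part is a guess. A quick sanity check of the arithmetic, using only numbers copied from the log above (the helper name `tensor_bytes` is just for illustration):

```python
from math import prod

# Shape copied from the "OOM when allocating tensor" log line.
shape = (65536, 256, 32, 1)

# Assumption: float32 elements (4 bytes each); the log does not state the dtype.
BYTES_PER_ELEMENT = 4


def tensor_bytes(shape, bytes_per_element=BYTES_PER_ELEMENT):
    """Return the bytes needed for a dense tensor of the given shape."""
    return prod(shape) * bytes_per_element


size = tensor_bytes(shape)

# "Limit" and "InUse" copied from the allocator Stats block.
limit = 11715375924
in_use = 11472467200
headroom = limit - in_use

print(f"requested: {size} bytes = {size / 2**30:.2f} GiB")
print(f"headroom:  {headroom} bytes = {headroom / 2**20:.1f} MiB")
```

Under that assumption the single tensor needs exactly 2 GiB, while the allocator only had roughly 230 MiB left under its limit, so the request could not succeed no matter how the remaining memory was arranged.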