Python Logging.

Sun, Apr 21, 2024

Tags: Software, Python, Logging, Intermediate Python

Two things I keep seeing:

People using f strings in logging.
People calling expensive functions eagerly in the log event to fetch debugging data.

I see it so much at this point that I’m not even correcting it when I see it in the most of the codebases I use. It seems too controversial of a thing to bring up, and honestly it’s not worth spending some of your capital with your colleagues on. While I do love fstrings, there are some great reasons outside of the above to not use them.

import logging
import time

log = logging.getLogger(__name__)

def fetch_something_expensive():
    time.sleep(1)
    return 'potato'

while something():
	...
	log.info(f"Data={fetch_something_expensive()}")

In the above code, the string interpolation will always happen. If the log level is set to error, the string interpolation will still happen. Even more than this, the expensive function call will also always happen.

With the dis module of Python we can prove this by disassembling some code into bytecode.

import logging

def expensive_calculation():
    # Simulating an expensive calculation
    result = sum(range(1000000))
    return result

def log_with_fstring():
    logging.info(f"Result of expensive calculation: {expensive_calculation()}")

If we then use the dis.dis function to disassemble, we can see what’s going on.

dis.dis(log_with_fstring)
  5           0 LOAD_GLOBAL              0 (logging)
              2 LOAD_METHOD              1 (info)
              4 LOAD_CONST               1 ('Result of expensive calculation: ')
              6 LOAD_GLOBAL              2 (expensive_calculation)
              8 CALL_FUNCTION            0
             10 FORMAT_VALUE             0
             12 BUILD_STRING             2
             14 CALL_METHOD              1
             16 POP_TOP
             18 LOAD_CONST               0 (None)
             20 RETURN_VALUE

Lines 1 to 4 are loads of modules, methods, strings and the respective function onto the stack for access. Line 5 invokes CALL_FUNCTION which calls the function with 0 arguments for inserting into the string, at this point we have a result from the function call on the stack. Lines 6 and 7 format the value using FORMAT_VALUE, if we had used any formatters then this is where they would take affect, e.g. for trimming decimal places on a float. We’ve now got a fully built string on the stack, everything else is cleaning up the stack and returning the function call.

Now using printf-style formatting:

def log_with_printf():
    logging.info("Result of expensive calculation: %s", expensive_calculation())

And disassembling into bytecode again…

dis.dis(log_with_printf)
  2           0 LOAD_GLOBAL              0 (logging)
              2 LOAD_METHOD              1 (info)
              4 LOAD_CONST               1 ('Result of expensive calculation: %s')
              6 LOAD_GLOBAL              2 (expensive_calculation)
              8 CALL_FUNCTION            0
             10 CALL_METHOD              2
             12 POP_TOP
             14 LOAD_CONST               0 (None)
             16 RETURN_VALUE

Lines 1 to 5 are the same, we’ve setup the local state for the function call and then run our expensive calculation. There’s no getting around this as the function call is within the scope of what’s being evaluated, if you wanted to delay this then maybe you could make it a callable that only runs when the __str__ dunder method is called.

Line 6, CALL_METHOD 2 is our method call to the logging class with our two args, the unformatted string and our argument to pass in. The rest of the cleanup is the same.

If we did want to delay the evaluation of our expensive calculation all together, we could use some kind of lazy class like the following:

class LazyString:
	def __str__(self):
		return expensive_calculation()

And the updated code…

def log_with_lazy():
	logging.info("Result of expensive calculation: %s", LazyString())

Would give you a lazy evaluating string that only does the expensive calculation IF the log level is high enough.