Python can handle this case, and much more, just fine; this is another place where you shouldn't blame Python. Iterating a few thousand files should take well under a second, and the depth of the directory tree shouldn't matter, at least if you're on an SSD.
First, the solution to all your woes is the glob module. You can grab everything with it, then iterate over the list. You might have to use os.path to process the results if you don't actually want the root prefix, but that's not a big deal.
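For instance, a sketch of the glob approach (the root directory and formats list here are made up, not from your code):

```python
import glob
import os.path

# Made-up inputs: the root you're scanning and the extensions you want.
root = "music"
formats = ["mp3", "flac"]

paths = []
for ext in formats:
    # "**" with recursive=True descends the whole tree under root.
    paths.extend(glob.glob(os.path.join(root, "**", "*." + ext), recursive=True))

# os.path pulls the pieces apart if you don't want the root in there:
names = [os.path.basename(p) for p in paths]
```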
Second, if you're not going to use the glob module, you should probably match using a pre-compiled regex. I'm not sure what your trim function is doing, but I'm pretty sure that's where the problem is. My guess is that you're either doing a whole lot of string splitting by hand, or building a regex every time through the loop; either is pretty bad. Just looking at the arguments tells me that it's either a very weird function or a very complicated function that does more than it appears it should.
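For what it's worth, if all trim does is pull the extension off a filename, the standard library already covers that with os.path.splitext. A hypothetical replacement (has_wanted_extension is my name, not yours):

```python
import os.path

def has_wanted_extension(filename, formats):
    # splitext returns ("name", ".ext"); strip the dot and lowercase
    # before comparing against the formats list.
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return ext in formats

print(has_wanted_extension("song.mp3", ["mp3", "flac"]))   # True
print(has_wanted_extension("notes.txt", ["mp3", "flac"]))  # False
```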
As I recall, the regex you need matches a dot followed by any of your extensions at the end of the name, and you want to re.compile it once, up front. To build it:
import re

ext_joined = "|".join(map(re.escape, formats))
pattern = r"\.({})$".format(ext_joined)  # raw string so \. survives
regex = re.compile(pattern)
Then you use the regex object for all your matching. Be careful to guard against the case of no formats: joining an empty list produces the pattern \.()$, which compiles fine but matches any name ending in a bare dot, so you must skip building the regex entirely when formats is empty.
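Putting the build and the guard together, a sketch (build_matcher is my name for it, and here an empty formats list is treated as "no matcher", which the caller decides how to interpret):

```python
import re

def build_matcher(formats):
    # Guard: with no formats the pattern would be r"\.()$", which
    # compiles but matches any name ending in a bare dot, so return
    # None instead and let the caller handle the no-filter case.
    if not formats:
        return None
    joined = "|".join(re.escape(f) for f in formats)
    return re.compile(r"\.({})$".format(joined))

regex = build_matcher(["mp3", "flac"])
print(bool(regex.search("song.mp3")))   # True
print(bool(regex.search("notes.txt")))  # False
```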
This:
if len(formats)==0 or ext in formats:
Can be:

if ext in formats:

Since nothing is ever in the empty list, the len check isn't pulling its weight once the no-formats case is handled separately. That won't shave off a lot of time, but it's cleaner code. Your next response is going to be "but ext isn't defined": that's fine, just do ext = "" before the optional trim call and it'll always have a value.
This shouldn't be an async function, because everything in it is synchronous. Putting the word async in front of a function doesn't stop it from blocking the event loop's main thread unless you also await something inside it. In general, file operations can't be made truly async because of OS limitations, so you either eat the cost of blocking the event loop for short stretches and do periodic async sleeps, or you move the work to a thread pool. I haven't done enough with Python's async to know exactly what to point you at, but the standard library has "please run this in a thread pool and give me a future" functions that will do what you want, and this particular case is literally what they were invented for.
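If you're on Python 3.9+, asyncio.to_thread is exactly that function (loop.run_in_executor is the older spelling). A sketch with a made-up scan_directory standing in for your function:

```python
import asyncio
import os

def scan_directory(root):
    # Plain synchronous function; nothing async about it.
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        found.extend(filenames)
    return found

async def main():
    # asyncio.to_thread runs the blocking call in the default thread
    # pool and hands back an awaitable, so the event loop stays
    # responsive while the walk happens on another thread.
    names = await asyncio.to_thread(scan_directory, ".")
    print(len(names), "files found")

asyncio.run(main())
```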
I would be curious as to your timing methodology. There is a module called timeit. You do:
import timeit
def test():
    pass  # replace with the code you want to time

count = 100
res = timeit.timeit(test, number=count)
print(res / count, "seconds per call")
Number defaults to 1000000, so you should always pass it. This tells you how much time the function takes per call. For really fast functions you want higher counts; for long ones like directory iteration, really low counts are fine. If your function isn't async (which, as I said, it shouldn't be), this can time it directly, but it can also time your trim function alone, which may be very informative here. In general "I feel like it takes some time on some directories" isn't helpful; "it takes 13 seconds on 500 files but 100 seconds on 1000 files" is very useful by contrast, because it tells you that doubling the number of files more than doubles the time, which shouldn't happen for this task.
My Blog
Twitter: @ajhicks1992