2021-03-25 08:50:08

Hello. I am working on a function that randomizes substrings within a string. For example:
I went to the <park, bar, church> and ordered a cheeseburger.
If I wanted to, I could make a list with these strings and call it place, so I can now do the following.
I went to the <place> and ordered a cheeseburger.
Now let's say I want to have more than 1 layer of randomization at a time.
I went to the <place> and bought a <<food>, <computer>, <game>, <car>>, and it was great.
The sentence makes no sense, but that's not the point.

So I have a function that uses re.split to find instances of the less than and greater than signs and parse them into a list.
However, that doesn't work with multiple layers. Instead of choosing one of the 4 lists and randomizing only that list, it randomizes all 4 lists, then chooses. That is great if I want to see every path it randomized, but here I don't care about that, and I don't want to waste time and randomization cycles on results that will never be seen.
I have determined that re.split cannot help with this. What I need is a character map that records, for every instance of the less than and greater than signs, the following 4 values:

The symbol found.
The level of the instance, i.e. the number of less than signs opened so far that have not yet been closed by a greater than sign.
The number of the instance within the current level; levels can vary throughout the string, and the count ignores that.
And the position in the string the instance occurs.

This would be a whole module on its own. Organizing the data, writing functions for adding and subtracting levels, finding instances and levels, etc.
So I was wondering if there is a module or a library that uses regular expressions that already has a character mapping system like I have described. I can't be the first person who has needed this.
If there is a much simpler way than needing this whole system, please let me know.
If not, at least this would be an interesting challenge to create. But I figured I would ask here to save my time before I dive into this.

2021-03-25 10:16:53

If it were me, and I may need something like this later, I'd probably store the original text string and the optional keywords together in a 2D array. Something like:

box = ["This is my %s, its %s",["clock","bat"],["awesome!","garbage."]]

It kind of depends on how you plan on using it, such as printing or some other processing method, but the idea is that you replace each "%s" in the string with an entry from the corresponding list. So, maybe even something like:

import random

box = [["clock","bat"],["awesome!","garbage."]]
text = "This is my "+box[0][random.randint(0,1)]+" its "+box[1][random.randint(0,1)]
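
The "%s" idea above can be sketched as follows. This is only a rough illustration, assuming each "%s" slot lines up, in order, with one list of options; fill_template is a hypothetical helper name, not something from the thread.

```python
import random


def fill_template(box):
    """Fill each %s in box[0] with a random pick from the matching list."""
    template, *option_lists = box
    # One pick per slot, in the same order as the lists were given.
    picks = tuple(random.choice(options) for options in option_lists)
    return template % picks


box = ["This is my %s, its %s", ["clock", "bat"], ["awesome!", "garbage."]]
print(fill_template(box))
```

Using random.choice avoids hard-coding the list length the way random.randint(0,1) does.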
-BrushTone v1.3.3: Accessible Paint Tool
-AudiMesh3D v1.0.0: Accessible 3D Model Viewer

2021-03-25 13:22:44

OK, two things: Firstly, you could probably use Jinja2 out of the box.

You may need to write your own filter, but then you could do:

"I {{ walked~ran~skipped|random_word }} to the {{ park~gym~bar~field|random_word }}."

Or you could simply use the re.sub method. It's a really interesting thing you're talking about. Maybe something I'd like to add into Earwax eventually, if you can come up with something awesome.
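
A minimal sketch of the re.sub route, assuming the comma-separated <...> syntax from the first post and only a single layer of brackets. GROUP, pick_option and randomize_flat are made-up names for illustration.

```python
import random
import re

# Matches one group such as <park, bar, church> with no brackets nested inside it.
GROUP = re.compile(r"<([^<>]*)>")


def pick_option(match):
    """Replace a matched group with one of its comma-separated options."""
    options = [option.strip() for option in match.group(1).split(",")]
    return random.choice(options)


def randomize_flat(text):
    """Randomize every single-layer group in the text."""
    return GROUP.sub(pick_option, text)


print(randomize_flat("I went to the <park, bar, church> and ordered a cheeseburger."))
```

re.sub accepts a function as the replacement and calls it once per match, which is what makes this so short. It does not handle nested groups on its own.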

-----
I have code on GitHub

2021-03-25 15:57:29

Yeah, Jinja2 is the easiest bet here.  I can't type Jinja2 off memory anymore, but the entire system to do what you want should be around 20 lines max.  You just make a function that picks a random string and call it in the template.  Jinja2 can do a lot of other stuff too, for example interpolate literal values, do if statements, a whole host of things.  The Synthizer HRTF processor uses it to turn Python lists into C arrays that it writes out to a .c file, for example.
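
Something along these lines is what "a function that picks a random string" could look like. It is a sketch rather than a definitive setup; pick is a hypothetical helper, exposed to templates through the environment's globals dictionary.

```python
import random

from jinja2 import Environment


def pick(*options):
    """Return one of the given options at random."""
    return random.choice(options)


env = Environment()
env.globals["pick"] = pick  # make the hypothetical helper callable inside templates

template = env.from_string(
    "I went to the {{ pick('park', 'bar', 'church') }} and ordered a cheeseburger."
)
print(template.render())
```

Calls can be nested, e.g. pick('car', pick('game', 'computer')), but the inner call is evaluated before the outer one, so every nested pick still runs.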

Your biggest problem (eventually, not now) will be that this is slow no matter how you do it, because of the string concatenation and such.  Even Jinja2 is only going to help with that so much, though for complex cases it will be faster than doing it yourself.  You'll need to cache the results in some fashion.  This is going to depend on what you want to do, but think of storing them on instance variables that you check first, or something similar.

For instance:

class RandomString:
    def __init__(self, template):
        self.template = template
        self.generated = None

    # ...
    def get(self):
        if self.generated is not None:
            return self.generated
        self.generated = self.generate_string()
        return self.generated

Then use that instead of strings anywhere you want one of these.

It is important to abstract the caching like that.  If you try to do it without an abstraction you will fail.
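
For illustration, here is a self-contained variant of that class with the generation step filled in. The options parameter, the str.format placeholders and the reset method are assumptions added for the sketch; the original leaves generation unspecified.

```python
import random


class RandomString:
    """Generates a random string once, then returns the cached result."""

    def __init__(self, template, options):
        self.template = template  # e.g. "I went to the {}."
        self.options = options    # one list of choices per {} slot (assumed layout)
        self.generated = None

    def generate_string(self):
        picks = [random.choice(choices) for choices in self.options]
        return self.template.format(*picks)

    def get(self):
        # Only pay for generation on the first call.
        if self.generated is None:
            self.generated = self.generate_string()
        return self.generated

    def reset(self):
        """Forget the cached value so the next get() rerolls."""
        self.generated = None


greeting = RandomString("I went to the {}.", [["park", "bar", "church"]])
print(greeting.get())
print(greeting.get())  # same text as the line above
```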

@3
Is the ~ thing valid syntax?  Curious if that's pseudocode or a Jinja feature I don't know.

My Blog
Twitter: @ajhicks1992

2021-03-25 19:06:04

I have installed jinja2. Let's see what I can do with this. It might be above my level, but I won't know until I try. I find this concept interesting, so I'll try to find or make the result I am looking for.
@4 I copied the class into a document. Thanks.

2021-03-25 19:48:28

Jinja2 is really easy.  Just read their official tutorials.  The templating language is designed so that web frontend people with no programming experience can use it, and the hardest thing about it is whitespace control, which they also have a chapter explaining.  You're doing single-line strings, so it shouldn't be a problem, but whitespace control lets you write your single-line strings across multiple lines and may be useful for that reason.


2021-03-25 20:12:43 (edited by Zarvox 2021-03-25 23:47:32)

I found this page which can help me get started. It has a lot of examples for syntax.
https://zetcode.com/python/jinja/
And also @4 yes that is the correct syntax, "| random_word" meaning a filter called random_word is applied to the value on its left.

2021-03-26 14:02:14

@4
No, I just wanted a separator for splitting a string into a list that wasn't standard punctuation. I'm not sure you can pass lists around with Jinja without them being in the environment to start with.

@7
You want to make your random_word function a filter. I believe on your environment, there is a filters dictionary, so you can add it there. I thought there was a decorator too, but that might have been with Flask or Django.


2021-03-27 13:56:26

@8 I am using a template directly, not an environment, so that dictionary may not exist, but we'll see. I know my random_word filter has to be a function, but I can't see the list of properties in the template, let alone find the template module in the jinja2 folder. But I'll try dir(jinja2.Template) to see if I can get more info. Also, I have no idea what to write in my function, so I have put this issue aside temporarily.

2021-03-27 14:15:10

Try this code:

from random import choice
from typing import List

from jinja2 import Environment, Template

e: Environment = Environment()


def random_word(input: str) -> str:
    """Get a random string."""
    strings: List[str] = input.split("~")
    return choice(strings)


e.filters["random_word"] = random_word

t: Template = e.from_string("Hello {{ 'world~people~folks'|random_word }}.")
print(t.render())

You could probably use regular expressions to make it cleaner, in fact I'm certain you could, but that works at least.
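
As a rough sketch of the regular expression version, the split can accept more than one separator symbol. The choice of "~" and "|" as separators is an assumption made for the example.

```python
import re
from random import choice

# Assumed separators: "~" or "|", with optional spaces around them.
SEPARATORS = re.compile(r"\s*[~|]\s*")


def random_word(value: str) -> str:
    """Split on any accepted separator and return one option at random."""
    options = [option for option in SEPARATORS.split(value.strip()) if option]
    return choice(options)


print(random_word("world ~ people | folks"))
```

It is registered the same way as before, through e.filters["random_word"].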


2021-03-27 17:06:28

Don't use dir.  They have tons and tons and tons of documentation.  In general learning something with dir should be your last choice, not your first.


2021-03-28 04:26:15

I looked at the jinja2 documentation for a minute, but I was more focused on finding out how to create a template directly than on reading it.
@10 your example uses the tilde symbol. What would happen if I used more than 1 symbol? I guess I could experiment myself instead of asking here, but why not.

2021-03-28 05:23:55

Yes, but in doing so you missed the other 90% that you needed in order to understand how it works, or at least that's how it appears.  If you always only focus on the specific thing you're trying to do, you'll miss the forest for the trees.  This and your other current thread illustrate a problem: rather than learning your tools, you only seem to learn the bare minimum you need to make the program work.  Reading the whole manual start to finish may seem boring, but it generally doesn't take that long and it usually saves a bunch of time.


2021-04-09 07:12:29

Camlorn, you were right. I made this character mapping function in about 20 lines. Super simple.

phrase= "This test is <more complicated <because of the> <multiple <layers> in> the sentence>."

function:

def character_map(string, chars):
 instances= []
 level=0
 for char in chars:
  #for each character that we have, add an instance counter
  char.update({"instance": 0})
 for x in range(len(string)):
  #position of the current character in the list of dictionaries, or -1 if it is not a symbol
  #(a stand-in for the custom lookup function, which was not posted)
  symbol= next((i for i, char in enumerate(chars) if char["symbol"]== string[x]), -1)
  #the current character in the string is a symbol match!
  if symbol>=0:
   symbol= chars[symbol]
   symbol["instance"]+=1
   #an opening symbol raises the level before it is recorded
   if symbol["direction"]== "open":
    level+=1
   instances.append({"symbol": string[x], "position": x, "level": level, "instance": symbol["instance"]})
   #a closing symbol lowers the level after it is recorded
   if symbol["direction"]== "close":
    level-=1
 return instances
results= character_map(phrase, [{"symbol": "<", "direction": "open"}, {"symbol": ">", "direction": "close"}])

1: (dictionary)
symbol: <
position: 13
level: 1
instance: 1
2: (dictionary)
symbol: <
position: 31
level: 2
instance: 2
3: (dictionary)
symbol: >
position: 46
level: 2
instance: 1
4: (dictionary)
symbol: <
position: 48
level: 2
instance: 3
5: (dictionary)
symbol: <
position: 58
level: 3
instance: 4
6: (dictionary)
symbol: >
position: 65
level: 3
instance: 2
7: (dictionary)
symbol: >
position: 69
level: 2
instance: 3
8: (dictionary)
symbol: >
position: 83
level: 1
instance: 4

How about them apples! Now I need to figure out how to use this to do what I want.

2021-04-09 07:32:21

Unfortunately, that was the easy part. Now I have to figure out the best way to split the string up into substrings, work with those individually, and piece them back together for the result. But at least I have the character map out of the way, which is what I was most overwhelmed by. That didn't take long, btw.
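
One possible way to get from bracket positions to substrings is to pair each closing sign with the most recent unclosed opening sign using a stack. This is a sketch of that general technique, not the approach the thread ends up with; bracket_spans is a hypothetical name.

```python
def bracket_spans(text, open_char="<", close_char=">"):
    """Return (start, end, level) for every matched pair, with end inclusive."""
    stack = []  # positions of opening signs that are still unclosed
    spans = []
    for position, char in enumerate(text):
        if char == open_char:
            stack.append(position)
        elif char == close_char:
            if not stack:
                raise ValueError(f"unmatched {close_char!r} at position {position}")
            start = stack.pop()
            # The level is how deep this pair sits, counting from 1.
            spans.append((start, position, len(stack) + 1))
    if stack:
        raise ValueError(f"unmatched {open_char!r} at position {stack[-1]}")
    return sorted(spans)


phrase = "This test is <more complicated <because of the> <multiple <layers> in> the sentence>."
for start, end, level in bracket_spans(phrase):
    print(level, phrase[start:end + 1])
```

Each tuple gives the slice of the string that one group covers, so a level 1 span can be cut out, processed, and spliced back in.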

2021-04-09 17:26:17

Yeah, but using Jinja2 should still be simpler than that thing, you know.


2021-04-20 09:37:49 (edited by Zarvox 2021-04-20 09:40:28)

An update here... I have finished my multi layer randomization quest!
You know that function I posted in post 14? Yeah... I didn't need that. I was overthinking this whole thing. I didn't need a character map at all. The closest I came to a character map is increasing and decreasing a level counter to determine whether the separators between options were at the root of the substring or were nested separators.
That being said, regular expressions are a hell of a tool that could help me in the future. They look terrifying, but if I can learn them, that will give me huge advantages as a programmer. I'm good with memorization, but I'm terrible at putting the pieces together. I'm getting better, but only slowly. I'm starting to build recursive functions, and those take a lot of brain power, so I'm doing something right lol, just not very fast right now.
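
The finished code was not posted, so the following is only a guess at the general shape of a level-counting approach, assuming the comma-separated <...> syntax from the first post and balanced brackets. split_top_level and randomize are hypothetical names.

```python
import random


def split_top_level(body, separator=","):
    """Split on separators that are not inside a nested <...> group."""
    options, current, level = [], [], 0
    for char in body:
        if char == "<":
            level += 1
        elif char == ">":
            level -= 1
        if char == separator and level == 0:
            options.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    options.append("".join(current).strip())
    return options


def randomize(text):
    """Expand every <...> group, descending only into the option that was picked."""
    result, level, start = [], 0, 0
    for position, char in enumerate(text):
        if char == "<":
            if level == 0:
                start = position + 1  # first character inside the outer group
            level += 1
        elif char == ">":
            level -= 1
            if level == 0:
                chosen = random.choice(split_top_level(text[start:position]))
                result.append(randomize(chosen))  # recurse into the chosen option only
        elif level == 0:
            result.append(char)  # plain text outside any group
    return "".join(result)


print(randomize("I went to the <park, bar> and bought a <<food, drink>, <game, car>>."))
```

Because the recursion only touches the chosen option, the unchosen branches are never randomized, which was the waste described in the first post.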

2021-04-20 11:26:07

@17
Good for you.

Regular expressions are amazingly powerful! Ever write a MUD client? Regexp are your friend. Need to validate input like email addresses? They'll help you there too. Advanced find and replace? Replacing cases? All regexp.

Also, it's good you spent time writing that function, even though you don't need it. All code is practise code as they say (actually I just made it up, but whatever).


2021-04-20 13:05:41 (edited by Zarvox 2021-04-20 13:07:03)

I wrote more functions for the character map before abandoning it. I saved it in a file just because. I'll probably never work on the character map beta again, especially if I take the time to learn regular expressions, but it was still good mental exercise because you need to know how to organize the data, how to gather the data you need, how to convert it into other formats that you can use for different purposes, etc.
However I'm glad my problem was much simpler than I initially thought.

2021-04-20 18:42:59 (edited by camlorn 2021-04-20 19:17:57)

Regular expressions can't count and may be unsuitable for this problem if you want to nest more than one level, e.g. "<hello <good morning | how are you>> | <goodbye <see you tomorrow | have a nice evening>>".  There are some extensions to handle this, but I'm not sure if Python has them.  Again: Jinja is your friend.  But if you don't want Jinja and you want to go beyond one level, you will either need to hand-write a parser or use ply.
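
A quick demonstration of the counting problem with Python's built-in re module, which has no recursive patterns. The sample text is a shortened, made-up variant of the example above.

```python
import re

text = "<hello <good morning | how are you>>"

# A non-greedy match stops at the first ">", cutting the outer group short.
naive = re.findall(r"<.*?>", text)
print(naive)      # ['<hello <good morning | how are you>']

# Excluding brackets inside the match finds only the innermost group.
innermost = re.findall(r"<[^<>]*>", text)
print(innermost)  # ['<good morning | how are you>']
```

Neither pattern can pair the outer "<" with its own ">", because that requires keeping track of how many groups are currently open.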
