2024 Note on Registrations

Shadowcat · 2024-03-20 04:12:22

Shadowcat
Dragon king
Offline

From: canada
Registered: 2005-04-05
Posts: 6,159
User Karma: 129

So interestingly enough, here, using those coordinates and right clicking using NVDA doesn't seem to actually do anything at all. What I can do, though, is physically move the mouse pointer around a little bit then actually right click. I don't get any options to open the image in a new tab, but it opens an immersive view dialogue that basically ends up doing the same thing. I wonder why it's different?

"You know nothing of death... allow me to teach you!" Dreadlich Tamsin
Download the latest version of my Bokura no Daibouken 3 guide here.

Exodus · 2024-03-20 04:29:23

Exodus
Hunter grunt
Offline

Registered: 2012-01-19
Posts: 1,318
User Karma: 403

The immersive view dialogue is what I get when I left click, so that's pretty strange.

Every record has been destroyed or falsified, every book rewritten, every picture has been repainted, every statue and street building has been renamed, every date has been altered. And the process is continuing day by day and minute by minute. History has stopped. Nothing exists except an endless present in which the Party is always right.

defender · 2024-03-20 11:37:04

defender
Fainting adventurer
Offline

From: Southwestern United States
Registered: 2012-01-13
Posts: 6,435
User Karma: 1,657

@Exodus
Yeah that's fair. I used to use an image downloader for this, but it can be a huge pain sorting out all the crazy image link names and what's actually useful. Ironically, an AI might be able to help with that. LOL

PREPARE
YOUR
ANUS!
https://freesound.org/people/SilverIllu … nds/546960

jsquared · 2024-03-25 09:28:26

jsquared
Galaxy stranger
Offline

From: Kalamazoo, MI
Registered: 2005-12-06
Posts: 81
User Karma: 13

I use Ecommerce Image Downloader Plus, which is a Chrome extension that will let me download the main product image for Amazon through links at the bottom of the page. You can then have these images described.
https://chromewebstore.google.com/detai … nbgbnookii

A T Guys Games
http://www.atguys.com/software

Zersiax · 2024-03-26 21:15:38

Zersiax
red potter
Offline

Registered: 2010-05-05
Posts: 704
User Karma: 70

The version on GitHub now supports Gemini and the three Claude 3 models. So I guess you can test whether it is better, and by how much.

ChipsAhoyMcCoy · 2024-03-27 11:35:37

ChipsAhoyMcCoy
Business monkey
Offline

Registered: 2023-12-01
Posts: 201
User Karma: 22

Hey guys, I'm having an issue here with updating the addon. On the GitHub page, it mentions that the latest version should have a dropdown where you can select which model to use. The thing is, in the NVDA Settings underneath the AI Content Describer, I have no such dropdown at all.

Is the newest version which supports this some sort of beta version? I see the latest as the 2024.03.13 version, but that's about it. Any idea what's going on?

Zersiax · 2024-03-27 12:30:08

Zersiax
red potter
Offline

Registered: 2010-05-05
Posts: 704
User Karma: 70

It's just not an official release yet. Clone the repository itself and copy the files in the addon directory, then update the manifest if required. Or just wait a bit, the author will probably release it soonish.
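
For anyone unsure what the copy step looks like in practice, here is a minimal Python sketch. It is only an illustration, not the author's install procedure: the function name `copy_addon_files` is made up, and both paths in the usage comment are assumptions (a clone with an `addon` folder, and an installed copy under `%APPDATA%\nvda\addons`), so check where your own copies actually live. Back up the installed folder first and restart NVDA afterwards.

```python
import shutil
from pathlib import Path


def copy_addon_files(repo_addon_dir: Path, installed_addon_dir: Path) -> list[str]:
    """Copy everything from the cloned repo's addon folder over the installed copy.

    Returns the relative paths of the files that were copied, so you can
    see what was overwritten or added.
    """
    if not repo_addon_dir.is_dir():
        raise FileNotFoundError(f"No addon folder at {repo_addon_dir}")
    # dirs_exist_ok=True lets the copy overwrite an existing installation
    # (requires Python 3.8 or later).
    shutil.copytree(repo_addon_dir, installed_addon_dir, dirs_exist_ok=True)
    return sorted(
        path.relative_to(repo_addon_dir).as_posix()
        for path in repo_addon_dir.rglob("*")
        if path.is_file()
    )


# Hypothetical usage; both paths are assumptions, adjust them to your machine:
# import os
# copied = copy_addon_files(
#     Path("ai_content_describer") / "addon",
#     Path(os.environ["APPDATA"]) / "nvda" / "addons" / "AIContentDescriber",
# )
# print("\n".join(copied))
```

The function only copies files; whether the manifest also needs updating, as mentioned above, depends on the state of the repository at the time.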

ChipsAhoyMcCoy · 2024-03-28 03:17:18

ChipsAhoyMcCoy
Business monkey
Offline

Registered: 2023-12-01
Posts: 201
User Karma: 22

Zersiax wrote:

It's just not an official release yet. Clone the repository itself and copy the files in the addon directory, then update the manifest if required. Or just wait a bit, the author will probably release it soonish.

Hey there,

Is this simple to do? I'm an extreme novice with anything relating to programming, so I have never done what you mention here. Thank you!

cartertemm · 2024-03-30 05:01:19

cartertemm
Administrator
Offline

From: Missouri
Registered: 2015-07-04
Posts: 1,072
User Karma: 588

Hello everyone,

Sorry for the silence on this. A new ask from a client of mine meant that my hands have been pretty tied over the last two weeks, but born out of necessity came face detection, modelled after VFO's recently introduced face in view feature.

I have published a pre-release for anyone who is kind or curious enough to test. Here is the changelog.

New in this version:

Added face detection, which will tell you whether you are clearly centered in the frame of your camera. The hotkey NVDA+shift+j is bound by default, and options to select a different device or release the camera so that other apps can use it may be found in the AI content describer context menu.
Added an option and script to take a picture from the onboard webcam.
Added Google's Gemini model.
Added the three major Claude 3 models (Haiku, Sonnet, and Opus).
Added support for Llama.cpp.
Rewrote the AI content describer section of the NVDA settings dialog, making many of the preferences model specific. On installation of the new version your settings should automatically be ported over.
Added a model selection submenu to the AI content description context menu.
Changed some default options to make them more logical.
Now, when you trigger a description action, the model in use will be spoken.

I am aware that the way I've designed the model selection means that the prompt field is unnecessarily difficult to get to, a noteworthy regression. Next to bugs, that will be the first thing I'll resolve in the upcoming version, after which will hopefully come multi-turn conversation.

Anyway, that's enough talking. Here's the link to the add-on file directly. AIContentDescriber-2024.03.29.nvda-addon.

I'm looking forward to getting everyone's feedback!

Feel free to check me out on github or follow me on twitter

rings2006 · 2024-03-30 05:11:10

rings2006
outbackstranaut
Offline

From: surrey, british columbia, cana
Registered: 2019-11-19
Posts: 2,252
User Karma: 36

really, really wish this was free

i am a system, i have headmates, and that is my life, and my discord is rings2006wilson#8609

ChipsAhoyMcCoy · 2024-03-30 06:42:35

ChipsAhoyMcCoy
Business monkey
Offline

Registered: 2023-12-01
Posts: 201
User Karma: 22

cartertemm wrote:

Hello everyone,
Sorry for the silence on this. A new ask from a client of mine meant that my hands have been pretty tied over the last two weeks, but born out of necessity came face detection, modelled after VFO's recently introduced face in view feature.
I have published a pre-release for anyone who is kind or curious enough to test. Here is the changelog.
New in this version:
Added face detection, which will tell you whether you are clearly centered in the frame of your camera. The hotkey NVDA+shift+j is bound by default, and options to select a different device or release the camera so that other apps can use it may be found in the AI content describer context menu.
Added an option and script to take a picture from the onboard webcam.
Added Google's Gemini model.
Added the three major Claude 3 models (Haiku, Sonnet, and Opus).
Added support for Llama.cpp.
Rewrote the AI content describer section of the NVDA settings dialog, making many of the preferences model specific. On installation of the new version your settings should automatically be ported over.
Added a model selection submenu to the AI content description context menu.
Changed some default options to make them more logical.
Now, when you trigger a description action, the model in use will be spoken.
I am aware that the way I've designed the model selection means that the prompt field is unnecessarily difficult to get to, a noteworthy regression. Next to bugs, that will be the first thing I'll resolve in the upcoming version, after which will hopefully come multi-turn conversation.
Anyway, that's enough talking. Here's the link to the add-on file directly. AIContentDescriber-2024.03.29.nvda-addon.
I'm looking forward to getting everyone's feedback!

Yeeeees, you are a godsend. Thank you so much for this! I'm extremely excited to try out Claude 3 Haiku. <3

zakc93 · 2024-03-30 08:13:07

zakc93
Altered being
Offline

From: South Africa
Registered: 2011-04-23
Posts: 924
User Karma: 137

@cartertemm,
Would you consider adding an option to type in a prompt upon triggering one of the options? Sometimes you encounter an image that you want to know something specific about, and it would be easier to do that than edit the default prompt each time. The simplest might be a checkbox in the settings that when checked will bring up a TextEntryDialog with the default prompt whenever an image description is done.
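
A rough sketch of how that checkbox idea could work, in Python. This is not the add-on's actual code: `choose_prompt`, `ask_each_time`, and `ask_with_wx` are hypothetical names, and the dialog text is made up. The only real API used is wxPython's `wx.TextEntryDialog`, which NVDA's own GUI is built on; it is imported lazily so the decision logic can be tried without a GUI.

```python
from typing import Callable, Optional


def choose_prompt(
    default_prompt: str,
    ask_each_time: bool,
    ask: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the prompt to send, or None if the user cancelled the dialog."""
    if not ask_each_time:
        # Checkbox unchecked: behave as before and use the saved prompt.
        return default_prompt
    typed = ask(default_prompt)
    if typed is None:
        return None  # the user cancelled, so no description should be requested
    # An empty entry falls back to the default prompt.
    return typed.strip() or default_prompt


def ask_with_wx(default_prompt: str) -> Optional[str]:
    """Show a wx.TextEntryDialog pre-filled with the default prompt."""
    import wx  # lazy import: needs wxPython and a running wx.App

    dialog = wx.TextEntryDialog(
        None, "Prompt for this image:", "AI Content Describer", default_prompt
    )
    try:
        if dialog.ShowModal() == wx.ID_OK:
            return dialog.GetValue()
        return None
    finally:
        dialog.Destroy()
```

Passing the dialog in as the `ask` argument keeps the setting check separate from the GUI, so `choose_prompt(saved, True, ask_with_wx)` would show the dialog while a plain function can stand in for it during testing.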

cartertemm · 2024-04-01 06:00:41

cartertemm
Administrator
Offline

From: Missouri
Registered: 2015-07-04
Posts: 1,072
User Karma: 588

zakc93 wrote:

@cartertemm,
Would you consider adding an option to type in a prompt upon triggering one of the options? Sometimes you encounter an image that you want to know something specific about, and it would be easier to do that than edit the default prompt each time. The simplest might be a checkbox in the settings that when checked will bring up a TextEntryDialog with the default prompt whenever an image description is done.

Totally, you are actually spot on with the implementation I have planned. Except there will also be a list of saved prompts. I've noticed that most people seem to cycle through instruction sets depending on the task at hand, e.g. recording a professional quality video will demand different feedback from attempting to deduce what an inaccessible button does, so the goal is to make it as straightforward as possible to toggle between them.

Another thing I've noticed is that while some of us really enjoy prompt engineering for the sake of seeing what we can squeeze out of the different models, the vast majority of this addon's users just want something that works. I think there would be immense value in a repository of good prompts for different tasks broken down by model and maintained by contributions from the community. Along with demonstrating what is all possible it would be good to document what works and where, so people aren't expected to pay for a model that may not be optimal for their primary case. I have a couple describe selfie prompts that work spectacularly in Be My AI and Claude, but Gemini completely chokes on the one I use for comics. They're interesting like that.

Feel free to check me out on github or follow me on twitter

ChipsAhoyMcCoy · 2024-04-01 10:57:08

ChipsAhoyMcCoy
Business monkey
Offline

Registered: 2023-12-01
Posts: 201
User Karma: 22

cartertemm wrote:

zakc93 wrote:
@cartertemm,
Would you consider adding an option to type in a prompt upon triggering one of the options? Sometimes you encounter an image that you want to know something specific about, and it would be easier to do that than edit the default prompt each time. The simplest might be a checkbox in the settings that when checked will bring up a TextEntryDialog with the default prompt whenever an image description is done.
Totally, you are actually spot on with the implementation I have planned. Except there will also be a list of saved prompts. I've noticed that most people seem to cycle through instruction sets depending on the task at hand, e.g. recording a professional quality video will demand different feedback from attempting to deduce what an inaccessible button does, so the goal is to make it as straightforward as possible to toggle between them.
Another thing I've noticed is that while some of us really enjoy prompt engineering for the sake of seeing what we can squeeze out of the different models, the vast majority of this addon's users just want something that works. I think there would be immense value in a repository of good prompts for different tasks broken down by model and maintained by contributions from the community. Along with demonstrating what is all possible it would be good to document what works and where, so people aren't expected to pay for a model that may not be optimal for their primary case. I have a couple describe selfie prompts that work spectacularly in Be My AI and Claude, but Gemini completely chokes on the one I use for comics. They're interesting like that.

In regards to this implementation, I thought something that could be pretty neat would be some sort of way to cycle between prompts like how you can choose different speech synthesis settings in NVDA. E.g., holding down the NVDA key, another key, and using the arrow keys to switch between prompt profiles.

The example I thought of was, say, you're playing a video game, and you have a prompt specifically for navigation, and one for reading the interface. You label them "Navigation", and "GUI". And you could swap between the two modes quickly that way. There could be maybe like three different presets for each model?

Not sure, this is one way I thought could work pretty well.
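
To make the idea a bit more concrete, here is a small Python sketch of cycling between named prompt profiles. It is purely illustrative and not how the add-on works: the `PromptProfiles` class, the profile names, and the prompt texts are all made up, and hooking it up to actual NVDA keystrokes is left out.

```python
from dataclasses import dataclass


@dataclass
class PromptProfiles:
    """A ring of named prompts that can be stepped through in either direction."""

    profiles: dict[str, str]  # profile name -> prompt text, in insertion order
    index: int = 0

    @property
    def current(self) -> tuple[str, str]:
        """Return (name, prompt) for the active profile."""
        name = list(self.profiles)[self.index]
        return name, self.profiles[name]

    def cycle(self, step: int = 1) -> str:
        """Move forward (step=1) or back (step=-1), wrapping around at the ends.

        Returns the newly selected profile name so it can be spoken.
        """
        self.index = (self.index + step) % len(self.profiles)
        return self.current[0]


# Hypothetical profiles for the video game example
profiles = PromptProfiles(
    {
        "Navigation": "Describe where the player is and which paths are open.",
        "GUI": "Read out any menus, dialogs, and on-screen text.",
    }
)
print(profiles.cycle())    # GUI
print(profiles.cycle())    # Navigation (wraps around)
print(profiles.cycle(-1))  # GUI
```

A separate set of profiles could be kept for each model by holding one `PromptProfiles` object per model name.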

TheTrueSwampGamer · 2024-04-03 06:00:08

TheTrueSwampGamer
Sarah's social circle
Offline

Registered: 2018-06-20
Posts: 1,906
User Karma: 91

Has any of this been updated? I've been trying to keep tabs on it, because I have the ChatGPT 4 addon installed and the key. Also, can you choose to use this with Claude, Gemini, or ChatGPT? I'm probably going to be switching to Claude because of its higher skill in some other aspects.

rings2006 · 2024-04-03 06:09:32

rings2006
outbackstranaut
Offline

From: surrey, british columbia, cana
Registered: 2019-11-19
Posts: 2,252
User Karma: 36

if you read the topic you would know that yes, you can

i am a system, i have headmates, and that is my life, and my discord is rings2006wilson#8609

Sean-Terry01 · 2024-04-03 07:11:02

Sean-Terry01
Tank Compadre
Offline

Registered: 2010-06-19
Posts: 1,656
User Karma: 50

Here's a question. Is there a way, or could a way be implemented, so you can get information about an overlay text dialogue? For example, when I use the Look Up Anything mod in Stardew Valley, I then use AICD to read the screen and it describes the scene in game. Then it says there is an overlay and it reads what is in the overlay. Then it goes back to describing the scene. So, there are times when I would just want it to tell me the info from the mod and that's it. If you want, I can post an example of this so you can see what I mean.

cartertemm · 2024-04-09 02:00:10

cartertemm
Administrator
Offline

From: Missouri
Registered: 2015-07-04
Posts: 1,072
User Karma: 588

@Sean-Terry01

I'm not too sure, and it might depend on the model, but you should be able to add something like "ignore the overlay and anything contained inside it" to your prompt.

Feel free to check me out on github or follow me on twitter

Sean-Terry01 · 2024-04-09 02:15:05

Sean-Terry01
Tank Compadre
Offline

Registered: 2010-06-19
Posts: 1,656
User Karma: 50

I thought I might have heard a bit back that you could ask a question from the result you get? Or, is that still being worked on?

jsquared · 2024-04-09 08:13:22

jsquared
Galaxy stranger
Offline

From: Kalamazoo, MI
Registered: 2005-12-06
Posts: 81
User Karma: 13

Sean-Terry01 wrote:

I thought I might have heard a bit back that you could ask a question from the result you get? Or, is that still being worked on?

Ya at some point, but for now you can edit the prompt under preferences and tell it what you do and don't want included. For instance, I will change it for YouTube to "just describe the contents of the video; ignore any controls or other page elements". The models will usually follow these types of instructions.

A T Guys Games
http://www.atguys.com/software

rings2006 · 2024-04-12 04:15:32

rings2006
outbackstranaut
Offline

From: surrey, british columbia, cana
Registered: 2019-11-19
Posts: 2,252
User Karma: 36

would love the option to enter a prompt when you hit the hotkey to do one of the actions, and the ability to ask things about the picture after submitting it and stuff. also, how do i get as much detail as jaws picture smart ai?

i am a system, i have headmates, and that is my life, and my discord is rings2006wilson#8609

assault_freak · 2024-04-12 18:00:06

assault_freak
You have defeated this forum and won a custom rank
Offline

From: Canada
Registered: 2005-03-06
Posts: 10,692
User Karma: 393

I just updated the addon, but I'm not able to see different models to try experimenting... which version should I be using to get access to this?

Discord: clemchowder633

defender · 2024-04-12 18:12:17

defender
Fainting adventurer
Offline

From: Southwestern United States
Registered: 2012-01-13
Posts: 6,435
User Karma: 1,657

Yeah, having a series of saved preset prompts that can be accessed with the 1 through 0 keys for instance, either in combination with the NVDA key and other modifiers or in a layer setup, would be incredible!
I feel bad asking for so much without giving anything in return though. A paypal.me or ko-fi link would be nice.

PREPARE
YOUR
ANUS!
https://freesound.org/people/SilverIllu … nds/546960

the_ruler_of_dark_forces · 2024-04-12 20:51:08

the_ruler_of_dark_forces
hero caller
Offline

From: Estonia, Tartu
Registered: 2005-03-05
Posts: 359
User Karma: 20

Yeah, I would also like to have the ability to toggle between different prompts. I guess reserving the whole number row would interfere with some other keystrokes, but I guess setting up 2, 3 or 4 prompts and toggling between them with a single keystroke is enough for me.
As an Estonian I would often like to get some images with Estonian text described, however the English descriptions are much more accurate, so switching the prompts would make the usage much much more handy!

rings2006 · 2024-04-14 07:37:04

rings2006
outbackstranaut
Offline

From: surrey, british columbia, cana
Registered: 2019-11-19
Posts: 2,252
User Karma: 36

so you know the nvda ocr viewer? not sure of the name, but the viewer used with nvda plus r for ocr that doesn't quite remove focus, the ai addon should be able to use that

i am a system, i have headmates, and that is my life, and my discord is rings2006wilson#8609

AI Content Describer for NVDA (Page 6 of 7)

Posts: 126 to 150 of 160

#126 Reply by Shadowcat 2024-03-20 04:12:22

#127 Reply by Exodus 2024-03-20 04:29:23

#128 Reply by defender 2024-03-20 11:37:04

#129 Reply by jsquared 2024-03-25 09:28:26

#130 Reply by Zersiax 2024-03-26 21:15:38

#131 Reply by ChipsAhoyMcCoy 2024-03-27 11:35:37

#132 Reply by Zersiax 2024-03-27 12:30:08

#133 Reply by ChipsAhoyMcCoy 2024-03-28 03:17:18

#134 Reply by cartertemm 2024-03-30 05:01:19 (edited by cartertemm 2024-03-30 05:05:51)

#135 Reply by rings2006 2024-03-30 05:11:10

#136 Reply by ChipsAhoyMcCoy 2024-03-30 06:42:35

#137 Reply by zakc93 2024-03-30 08:13:07

#138 Reply by cartertemm 2024-04-01 06:00:41

#139 Reply by ChipsAhoyMcCoy 2024-04-01 10:57:08

#140 Reply by TheTrueSwampGamer 2024-04-03 06:00:08

#141 Reply by rings2006 2024-04-03 06:09:32

#142 Reply by Sean-Terry01 2024-04-03 07:11:02

#143 Reply by cartertemm 2024-04-09 02:00:10

#144 Reply by Sean-Terry01 2024-04-09 02:15:05

#145 Reply by jsquared 2024-04-09 08:13:22

#146 Reply by rings2006 2024-04-12 04:15:32 (edited by rings2006 2024-04-12 04:26:40)

#147 Reply by assault_freak 2024-04-12 18:00:06

#148 Reply by defender 2024-04-12 18:12:17 (edited by defender 2024-04-12 18:58:51)

#149 Reply by the_ruler_of_dark_forces 2024-04-12 20:51:08

#150 Reply by rings2006 2024-04-14 07:37:04
