Improving search

knnn:
I think it's not blasphemy to declare that the search function on this site is sorely lacking: using it yields very limited (and sometimes completely wrong) results.

Now I can totally understand why (e.g.) Google search might have been disabled from the forums -- it drives a lot of unnecessary traffic, and possibly increases the cost of running the site (having to pay for page-views every time a web-crawler goes through every page).

That said, I think it might be within my capability to put together a private, local search for the forums.  Depending on the technology used, it would not necessarily be generally accessible and could be updated only as often as requested. 

This is mainly just me having fun playing around with some tech tools and maybe coming up with something remotely useful.  The things I am asking permission for are:

1) Any fast websearch needs to "crawl" every post on the forums periodically to maintain its list of searchable keywords.  This means a potentially large amount (tens of thousands?) of queries to the forum server every time the list gets updated.  Is it ok to do this (say once a month)?

2) While anyone can post a link online to any topic on the (publicly accessible parts of the) forums, this would be essentially making large parts of the forum a lot more accessible to the public in a targeted manner.  Is there a problem with this?
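To put the load in question 1 into numbers, here is a rough back-of-the-envelope sketch. The figure of 30,000 requests is only an assumed stand-in for "tens of thousands", and the function name is made up for illustration.

```python
def seconds_between_requests(num_requests: int, days: int = 30) -> float:
    """Average gap between requests if a crawl is spread evenly over `days` days."""
    seconds_available = days * 24 * 3600
    return seconds_available / num_requests


# Assumed crawl size: 30,000 page requests spread over a 30-day month.
gap = seconds_between_requests(30_000, days=30)
print(f"one request every {gap:.1f} seconds")
```

Under those assumptions, a monthly crawl spread evenly over the month works out to roughly one request every minute and a half.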

---

Note that I only have vague, random thoughts about how to go about doing this.  I am asking first just to make completely sure there isn't any problem.  Otherwise, I can always go and play with other toys.   ;)

Iam that kemmler:
Using toys or creating crawlers would just add to the cost of hosting. If you had access to the actual database and if all the proper indexing was already done - then a page with an interface to search on would actually be pretty simple to do.

Many folks have wanted to help shape stuff around here - but Iago rarely acknowledges or goes with tech advice.

If you really wanted to build a tool - you could just load up SMF and check out the data structure and write the page as proof of concept. I'm pert damn sure it'll get shot down because of the hidden forums on this site.

(I own a company that does custom programming and I have offered to help before)

Serack:

--- Quote from: Iam that kemmler on July 07, 2015, 01:23:45 AM ---Using toys or creating crawlers would just add to the cost of hosting. If you had access to the actual database and if all the proper indexing was already done - then a page with an interface to search on would actually be pretty simple to do.

Many folks have wanted to help shape stuff around here - but Iago rarely acknowledges or goes with tech advice.

If you really wanted to build a tool - you could just load up SMF and check out the data structure and write the page as proof of concept. I'm pert damn sure it'll get shot down because of the hidden forums on this site.

(I own a company that does custom programming and I have offered to help before)

--- End quote ---

Edit:  Doh, accidentally hit post without writing anything

Yah, Iago resists doing changes here.

That said, he has commented in the past that on top of how heinous it is on the user side, the search function is a huge use of server resources, which is why he limits it to one search per 60 seconds, and only 25 hits. 
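For reference, the two limits mentioned above (one search per 60 seconds, at most 25 hits) can be sketched as below. This is only an illustration of the idea, not how SMF or this site actually implements it; the class and method names are invented.

```python
import time


class SearchThrottle:
    """Illustrative per-user search limiter (hypothetical, not SMF's code)."""

    def __init__(self, min_interval=60.0, max_hits=25, clock=time.monotonic):
        self.min_interval = min_interval  # seconds required between searches
        self.max_hits = max_hits          # maximum results returned per search
        self.clock = clock                # injectable clock, handy for testing
        self._last_search = {}            # user -> time of last accepted search

    def allow(self, user):
        """Return True and record the search if the user has waited long enough."""
        now = self.clock()
        last = self._last_search.get(user)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_search[user] = now
        return True

    def cap(self, results):
        """Trim a result list to the configured maximum number of hits."""
        return results[: self.max_hits]
```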

I'm not quite sure what knnn is proposing, but it kinda sounds like he might be asking to crawl the site, and host a separate search function, which could eliminate the need for the current search function and its toll on the server...

Change is unlikely.

knnn:

--- Quote from: Serack on July 09, 2015, 02:57:31 PM ---
I'm not quite sure what knnn is proposing, but it kinda sounds like he might be asking to crawl the site, and host a separate search function, which could eliminate the need for the current search function and its toll on the server...

Change is unlikely.

--- End quote ---

Exactly.  The idea would be to crawl the site infrequently (say once a month), and cache all the posts on a different site with links to the original posts.  That way, any search will just use resources from the alternate site.   Sure, people would still be following the links to posts on this site, but that shouldn't be any different than some poor user trying to search through the last 100 pages of posts looking for something specific. 
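As a rough sketch of what "cache the posts elsewhere, with links back to the originals" could look like, here is a tiny keyword index that maps words to post URLs. Everything in it is an assumption for illustration: the function names, the tokenizer, and the example.invalid URLs are made up, and a real setup would more likely use an existing search engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(posts):
    """Build word -> set of post URLs from an iterable of (url, text) pairs."""
    index = defaultdict(set)
    for url, text in posts:
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return sorted URLs of posts containing every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical cached posts; the URLs are placeholders, not real forum links.
cached_posts = [
    ("https://forum.example.invalid/topic/1", "Improving the search function"),
    ("https://forum.example.invalid/topic/2", "Search is slow on this server"),
]
index = build_index(cached_posts)
print(search(index, "search function"))
```

The search itself only touches the index on the alternate site; the original server is hit only when someone follows a result link.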

The only real disadvantage I can see with this scheme (other than the aforementioned need to crawl every post on this site periodically) is that with a search that doesn't suck, people might begin to expect they can actually find old posts and maybe use the site more often than we want them to.

knnn:
Frankly, I'm tempted to just go and test my theories on the DF reference subforum.  It changes quite infrequently and is an order of magnitude smaller than the full site, so I can probably do a one-time "one thread an hour" indexing that would barely even show up as background noise.   

This would also be much easier to host separately/locally (*way* less space to take up), thus allowing me to play with stuff safely.  And if it actually worked, it would be a reasonable proof-of-concept test for the full treatment, so we wouldn't just be yakking about pie-in-the-sky ideas.
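A minimal sketch of the "one thread an hour" idea, assuming the list of thread URLs is already known and that `fetch` is whatever function actually downloads a page (both are placeholders here, as are the example URLs):

```python
import time


def crawl_politely(urls, fetch, delay_seconds=3600.0, sleep=time.sleep):
    """Fetch each URL in order, pausing `delay_seconds` between requests.

    `fetch` is a caller-supplied function (url -> page text); `sleep` is
    injectable so the pacing can be tested without actually waiting.
    """
    pages = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # wait before every request except the first
        pages[url] = fetch(url)
    return pages


# Dry run with a fake fetcher and a no-op sleep, so nothing touches the network.
demo = crawl_politely(
    ["https://forum.example.invalid/topic/1", "https://forum.example.invalid/topic/2"],
    fetch=lambda url: f"contents of {url}",
    sleep=lambda seconds: None,
)
print(len(demo), "pages cached")
```

With the default one-hour delay, a subforum of N threads takes about N - 1 hours to index once.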
