In the thread http://forums.adobe.com/thread/740631 (TIFF vs PSD CS5) in the InDesign forum, I've received several posts in email today that do not appear in the web interface. The latest post in the web interface is:
http://forums.adobe.com/message/3668725 (12:40pm Pacific, RobertoBlake)
According to the emails from the thread, there have been five subsequent posts, beginning with
http://forums.adobe.com/message/3668797 (12:53pm Pacific, macinbytes)
and ending with
http://forums.adobe.com/message/3669007 (2:02pm Pacific, RobertoBlake)
I can only assume there is some kind of database corruption? All times above are the email timestamps. In the case of the first one,
the web timestamp is actually 12:39pm Pacific, so I guess they don't match 1:1 at all times.
From Jive for John and Jochem:
...
Many thanks to your user, that's great data - I found that node 10.137.24.43 (42p) had dropped out of the cluster and disabled clustering. It's likely any posts made from this node wouldn't make it to the caches of the other nodes, which would explain why the caches were getting out of sync. I've resolved that, the node has rejoined the cluster and I've cleared the caches again to sync them back up. At this point the caches shouldn't be getting out of sync anymore;
There have been some oddities over the last two days with regard to replies showing. I've had some magically appear while I was reading the "last post," and their date/time stamps indicate that they were posted before I began reading.
There is work being done by Adobe, so maybe some part of that is affecting the display.
Also, if a spammer posts to a thread (as an example), and one has e-mail notification ON, they will be notified, but Adobe-Admin might remove that/those offending post(s), before one gets to the thread. I've had similar happen, with the (Updated) flag in the main page, only to find that the last post that I saw, some days before, is all that is now present. I've found some references to those threads in the Spam Report thread here, and then know what happened. Not saying that this is the cause here, but it's a possibility.
Out of curiosity, on the InDesign main page, with the thread listings, do you see the "proper" reply count, with the poster's screen name, that matches up to your e-mail notification?
Maybe a MOD or an Adobe-Admin will have some comments on the thread that you cited, and then we'll know.
Good luck,
Hunt
I just had the anomaly with the (Updated) flag in this THREAD. I had seen it earlier today, and it then showed as having new replies, but did not. I also noted that the Last Post button did not function, and I had to scroll. Not sure if there is any relationship.
Good luck,
Hunt
Same exact thing with this THREAD. It showed as (Updated), but the last post was mine, on April 21. Here, the Last Post worked fine?
Oddities? Yes, no doubt.
Hunt
These were definitely not spam posts.
It seems to be happening in other threads.
Indeed, I could not find the thread at all on the main InDesign
page. That's worrisome.
I've gotten a couple of "Unexpected Error" messages on posting today (Tuesday, May 10), and when I Cancel, I see that the post has been uploaded. In several threads on several of the Premiere forums, it seems that many posters are getting double, and even triple posts. Not sure if this is related to the Unexpected Error, or something totally removed.
Either way, it does seem that there is wonkiness in the forum today.
Good luck to us all,
Hunt
[Edit] It happened with this reply too!
(this posted via the web interface)
These were definitely not spam posts.
It seems to be happening in other threads.
Indeed, I could not find the thread at all on the main InDesign page. That's worrisome.
Sounds like one of the server nodes is out of sync. Sending off a support ticket.
Any updates on the support ticket?
I continue to see these problems, e.g. in thread
http://forums.adobe.com/thread/850157
talking to server node SGAURWA43P.
Has Jive said anything useful?
Can I provide more information on the problem somehow?
[can anyone tell me how to talk to specific server nodes to look for data inconsistencies?]
All of the nodes were restarted a few hours ago.
I see seven replies listed for that discussion. What do you see?
I see seven replies listed for that discussion. What do you see?
I see seven as well. But the header indicates a 09:54 Pacific
reply by 'peter at knowhowpro' which is not one of the 7, and I have
Peter's reply in my mailbox.
It gets odder. It says 7 replies at the top of the page, but I actually can see 8 replies. Last one is from peter at 9:54...
Yes, that is the post that I do not see on the web interface.
I assume you're using a different node...
According to the widget in the Testing Forum, I am on 43P.
John
I noticed you mentioned 43P, too, a few messages back. Can you confirm that you are still on that node? If we're on the same node and seeing different messages that sounds even worse.
Sadly, yes, I still seem to be on 43P and the last post in that thread I see
is reply #7 from curtis368. (I don't suppose there's
a chance you are on a different prefix also ending in 43P?).
I even tried removing the jive.server.info cookie and reloading
the page and still get the same cookie with 43P.
Possibly unrelated but curious to me is that if I use a command-line
web client (w3m) from another system, I get the full proper page
and the cookie says localName=10.137.24.43; and it seems awfully weird
that some hosts might be named with IP addresses and others with names,
and also odd that the two I happened to get both have '43' in them...
Also cleared the browser cache ... no dice.
It looks like you guys are running a BigIP redirector...
maybe that's involved with the problem? Working (w3m) session
has BIGipServerPool_53_ENT2=723028234.22555.0000, broken
Firefox has BIGipServerPool_53_ENT2=739805450.22555.0000.
shrug
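For anyone comparing these cookie values, here is a minimal sketch of decoding them. It assumes the pool uses F5's default, unencrypted BIG-IP cookie persistence format, in which the first field is the pool member's IPv4 address stored as a little-endian integer and the second field is the port with its two bytes swapped. The function name `decode_bigip_cookie` is made up for this example. Under that assumption, the working session's value decodes to 10.137.24.43:7000 and the broken one to 10.137.24.44:7000.

```shell
# Decode an unencrypted BIG-IP persistence cookie value of the form
# <ip>.<port>.0000 into "a.b.c.d:port".
# Assumption: the default F5 cookie encoding is in use (not an encrypted cookie).
decode_bigip_cookie() {
  value=$1
  ip_enc=${value%%.*}     # text before the first dot
  rest=${value#*.}        # text after the first dot
  port_enc=${rest%%.*}    # text between the first and second dots

  # The address is a little-endian 32-bit integer: the low byte is the first octet.
  o1=$(( $ip_enc & 255 ))
  o2=$(( ($ip_enc >> 8) & 255 ))
  o3=$(( ($ip_enc >> 16) & 255 ))
  o4=$(( ($ip_enc >> 24) & 255 ))

  # The port is a 16-bit value with its two bytes swapped.
  port=$(( (($port_enc & 255) << 8) | (($port_enc >> 8) & 255) ))

  printf '%s.%s.%s.%s:%s\n' "$o1" "$o2" "$o3" "$o4" "$port"
}

# Usage:
#   decode_bigip_cookie 723028234.22555.0000
```

If the decoded address matches the localName reported in the jive.server.info cookie, that would suggest the load balancer's pool member and the unnamed node are the same machine.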
Adding all of your comments to the support ticket. Probably won't hear anything else back this evening. I'll check back in the morning.
Some more data. If I throw away all cookies, I (usually) get balanced
to a new server.
Doing so 7 times from the same IP address, twice I get the full thread
from 10.137.24.43 (no name). The remaining 5 times I get the truncated
thread, twice from 41P and 44P, and once from 43P.
I have to say, this is impressive!
Edit: noname ipaddr is 10.137.24.43 not 10.137.24.4. (cut-and-paste error)
Yeah, 7 trials. Not so exciting. So I ran 1000 repetitions of
$ curl -v http://forums.adobe.com/message/3671936#3671936 2>&1| egrep 'title="in response|server.info'
< Set-Cookie: jive.server.info="serverName=forums.adobe.com:serverPort=80:contextPath=:localName=SGAURWA41P:localPort=9000:localAddr=127.0.0.1"; Version=1; Path=/
title="in response to: JWH-NIRC"
title="in response to: JWH-NIRC"
This reports the server name and the responses for thread http://forums.adobe.com/message/3671936#3671936, another one that I found to be problematic. Here are the summary stats:
| Trials | Node | # responses |
|---|---|---|
| 240 | 10.137.24.43 | 3 |
| 247 | SGAURWA41P | 2 |
| 246 | SGAURWA43P | 2 |
| 266 | SGAURWA44P | 2 |
So basically, all the nodes with proper names do not seem to be syncing up with the 10.137.24.43 node, which is presumably misconfigured anyhow, since it has an RFC1918 address as its nodename instead of a node name. It's also plausibly misconfigured since you would think 43P would be the node ending in .43, so something else is messed up, too.
I'd theorize, then, that if a poster gets load-balanced to the 24.43 server, then his/her posts get tossed into lala land until something happens that forces a database quorum sync?
Or something.
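A rough sketch of how a tally like the one above could be scripted, for anyone who wants to repeat the experiment. The function names `summarize_response` and `tally_nodes` are invented for this example, and it assumes the page markup still contains `title="in response` once per reply and that the `jive.server.info` cookie still carries a `localName=` field. Each request is made without a cookie jar, so the load balancer is free to pick a new node every time.

```shell
# Read `curl -v ... 2>&1` output on stdin and print "<node> <reply-count>".
summarize_response() {
  awk '
    /jive\.server\.info/ {
      # Pull out the value of localName= (up to the next ":" or quote).
      if (match($0, /localName=[^:"]*/))
        node = substr($0, RSTART + 10, RLENGTH - 10)
    }
    /title="in response/ { replies++ }
    END { print (node == "" ? "unknown" : node), replies + 0 }
  '
}

# Fetch the URL the given number of times and count how often each
# (node, reply-count) pair shows up.
tally_nodes() {
  url=$1
  trials=$2
  i=0
  while [ "$i" -lt "$trials" ]; do
    curl -s -v "$url" 2>&1 | summarize_response
    i=$((i + 1))
  done | sort | uniq -c
}

# Usage:
#   tally_nodes 'http://forums.adobe.com/message/3671936#3671936' 1000
```

Each output line is a count followed by the node name and the number of replies that node returned, so nodes that disagree about a thread show up as rows with different reply counts.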
They (Jive) cleared some caches this evening, attempting to clear this up. I'll pass along this additional test info.
John
Thanks. Just in case it wasn't clear, this behavior persists right now. (If you have a mac or a unix machine handy, pasting that curl|egrep line into the shell and hitting up-arrow/return a few times should show you the differing behavior on each node).
I suspect it's a config problem and not [only?] a caching problem, but that may be naive.
Yes. Understood. They did the cache clear a few hours ago. Still happening.
Thanks!
John Hawkinson wrote:
So basically, all the nodes with proper names do not seem to be syncing up with the 10.137.24.43 node, which is presumably misconfigured anyhow, since it has an RFC1918 address as its nodename instead of a node name. It's also plausibly misconfigured since you would think 43P would be the node ending in .43, so something else is messed up, too.
The nodes refer to Tomcat instances of which there are 4. I have only ever seen 2 IP addresses, so I doubt there is a one-on-one correspondence between node names and IP addresses.
I'd theorize, then, that if a poster gets load-balanced to the 24.43 server, then his/her posts get tossed into lala land until something happens that forces a database quorum sync?
Until cache expiration / resync. There is a single PostgreSQL database, but there is a distributed Oracle Coherence cache and I believe every node has its own cache instance. Since there is a single database and that database has proper referential integrity, the issues should sort themselves out eventually without data loss or conflicts.
There were still issues earlier today, Wed, May 11, up to about Noon PDT, and in some threads (in Premiere Elements forum), replies are now showing, but the reply count is still off in some, though the display of the actual replies seems to have been fixed. Earlier, I would see 0 replies, and add a reply, only to see 1, or more posts, and some time stamped at least an hour before my reply. They would just suddenly appear, as soon as the thread refreshed. A manual refresh (Chrome) did NOT bring those other replies up.
Maybe with the re-syncing, some data (reply count here) was lost in the process?
Good luck,
Hunt
John Hawkinson wrote:
[can anyone tell me how to talk to specific server nodes to look for data inconsistencies?]
You can switch nodes at will by modifying the value of the BIGipServerPool_53_ENT2 cookie. I don't know if they have changed, but in 2009 the names / values were:
SGAURWA41P: 706251018.22555.0000
SGAURWA42P: 723028234.22555.0000
SGAURWA43P: 739805450.22555.0000
SGAURWA44P: 756582666.22555.0000
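A small sketch of pinning a request to one node with these values. It assumes the 2009 cookie values listed above are still current, which may not be the case, and the helper names `node_cookie` and `fetch_via_node` are made up for this example. It relies on curl's `-b` option, which sends a literal `NAME=VALUE` cookie when its argument contains an equals sign.

```shell
# Map a node name to its load-balancer cookie, using the 2009 values
# quoted in this thread (assumption: they have not changed since).
node_cookie() {
  case $1 in
    SGAURWA41P) value=706251018.22555.0000 ;;
    SGAURWA42P) value=723028234.22555.0000 ;;
    SGAURWA43P) value=739805450.22555.0000 ;;
    SGAURWA44P) value=756582666.22555.0000 ;;
    *) echo "unknown node: $1" >&2; return 1 ;;
  esac
  printf 'BIGipServerPool_53_ENT2=%s\n' "$value"
}

# Fetch a URL while presenting the cookie for the chosen node, so the
# load balancer should route the request there.
fetch_via_node() {
  cookie=$(node_cookie "$1") || return 1
  curl -s -b "$cookie" "$2"
}

# Usage:
#   fetch_via_node SGAURWA43P 'http://forums.adobe.com/thread/850157'
```

Fetching the same thread through each node in turn and comparing the results is one way to look for the inconsistencies discussed earlier.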