How To Resolve SharePoint 2016 UPA Endpoint Issue

The SharePoint 2016 UPA endpoint issue occurs when the User Profile Service instance is moved to a different server. Today, SharePoint gave us a bit of a tough time again. Aahh! That is normal if you are a SharePoint admin.

Problem

Because of a capacity issue, we decided to add more servers to the farm. After successfully adding the servers to the farm, we moved services from the existing servers to the newly added ones, as planned. We successfully moved the Distributed Cache service, App Management Service, Managed Metadata Service, and Central Admin; the only exception was the User Profile Service application. When we started the User Profile Service instance on the new server and stopped it on the existing one, SharePoint started throwing errors: the newsfeed stopped working, and the Site Contents page displayed the following error.

"We're having trouble loading some parts of this page, but your documents aren't affected. While we work on fixing it, you can continue using your documents like you normally would."

If we tried to call the UserProfile asmx web service from an InfoPath form, we got the following error.

"The SOAP response indicates that an error occurred on the server:

Server was unable to process request. ---> UserProfileDBCache_WCFLogging :: ProfileDBCacheServiceClient.GetUserData threw exception: The HTTP service located at http://kf-sp1:32843/3da75a6367304a72a5e80573218d2313/ProfileDBCacheService.svc is unavailable. This could be because the service is too busy or because no endpoint was found listening at the specified address. Please ensure that the address is correct and try accessing the service again later. ---> The remote server returned an error: (503) Server Unavailable."
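The error says that nothing may be listening at the address, and that can be checked directly from the server that reports it. The snippet below is a minimal sketch, not part of the original troubleshooting: it only tests whether a TCP connection to the host and port in the URL succeeds. The function name `endpoint_is_listening` and the default timeout are assumptions made for illustration.

```python
import socket
from urllib.parse import urlsplit


def endpoint_is_listening(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `endpoint_is_listening("http://kf-sp1:32843/3da75a6367304a72a5e80573218d2313/ProfileDBCacheService.svc")` from a web front end would show whether the old server still accepts connections on port 32843. A `False` result only says the port is closed or unreachable; it does not explain why SharePoint is still using that address.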

Troubleshooting

We checked the ULS logs and found these entries, which clearly showed that SharePoint was, for some reason, still looking to the old server for the UPA service instance.

06/09/2017 13:04:33.13 w3wp.exe (KF-SP:0x1F10) 0x1254 Document Management Server Reporting avv31 Medium GetUserProfile_RetrieveUser_Cache Start: My Scenario Start d6fff99d-0d14-702b-e436-f35f46ae1347

06/09/2017 13:04:33.13 w3wp.exe (KF-SP:0x1F10) 0x1254 SharePoint Portal Server User Profiles ajk39 Medium UserProfileDBCache_WCFLogging::Begin ProfileDBCacheServiceClient.GetUserData.ExecuteOnChannel d6fff99d-0d14-702b-e436-f35f46ae1347

06/09/2017 13:04:33.13 w3wp.exe (KF-SP:0x1F10) 0x1254 SharePoint Portal Server User Profiles arkm4 Medium ChannelInvoke::GetUserData::1 -- Executing code block on endpoint [http://kf-sp1:32843/3da75a6367304a72a5e80573218d2313/ProfileDBCacheService.svc]. d6fff99d-0d14-702b-e436-f35f46ae1347

06/09/2017 13:04:33.13 w3wp.exe (KF-SP:0x1F10) 0x1254 SharePoint Foundation Topology e5mc Medium WcfSendRequest: RemoteAddress: 'http://kf-sp1:32843/3da75a6367304a72a5e80573218d2313/ProfileDBCacheService.svc' Channel: 'Microsoft.Office.Server.UserProfiles.IProfileDBCacheService' Action: 'http://Microsoft.Office.Server.UserProfiles/GetUserData' MessageId: 'urn:uuid:0d5dd405-c724-464c-ad74-b44dd652701b' d6fff99d-0d14-702b-e436-f35f46ae1347

06/09/2017 13:04:34.15 w3wp.exe (KF-SP1:0x1F10) 0x1254 SharePoint Portal Server User Profiles arkm8 High ChannelInvoke::GetUserData::1 -- CommunicationException occurred: System.ServiceModel.EndpointNotFoundException: There was no endpoint listening at http://kf-sp1:32843/3da75a6367304a72a5e80573218d2313/ProfileDBCacheService.svc that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 19.192.168.188:32843 at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress) at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress&... d6fff99d-0d14-702b-e436-f35f46ae1347
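When the log is long, it can help to pull out every service endpoint it mentions and count which host each one points to. The following is a rough sketch that assumes the ULS entries have been exported to plain text lines like the ones above. The regular expression and the function name `endpoint_hosts` are illustrative assumptions, not a SharePoint tool.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches http(s) URLs ending in .svc, such as the ProfileDBCacheService address.
ENDPOINT_RE = re.compile(r"https?://[^\s'\[\]]+?\.svc", re.IGNORECASE)


def endpoint_hosts(log_lines):
    """Count how often each host:port appears as a .svc endpoint in the lines."""
    hosts = Counter()
    for line in log_lines:
        for url in ENDPOINT_RE.findall(line):
            # netloc is the "host:port" part of the URL.
            hosts[urlsplit(url).netloc.lower()] += 1
    return hosts
```

If every match still resolves to the old server (here `kf-sp1:32843`) after the service instance has moved, the caller is working from a stale address.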

We tried the following things to resolve the issue.

  • Ran an IIS reset on the server where we had stopped the UPA instance and on the other App and DC servers, but not on the WFEs.
  • Cleared the config cache on all servers in the farm. We even confirmed that the XML file in the config cache folders contained the correct endpoint for the UPA.
  • Restarted the timer service on all servers.
  • Made sure the topology service was working on all servers.
  • Ran the "Application Addresses Refresh Job" timer job (which runs every 15 minutes). This job is responsible for updating the connection addresses in the config.
  • I even removed that server from the farm (yes, you read that correctly, I removed the server from the farm), but still no luck.

Resolution

Finally, we cracked it. It looks like the endpoint was cached on the web front-end servers. As soon as we ran an IIS reset on all web front ends, SharePoint started working as expected and picked up the endpoint from the new servers.

In the future, if you see that SharePoint is not picking up the new endpoint and keeps reporting the same errors, run an IIS reset on the web front-end servers.
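With several web front ends, the reset can be scripted so that no server is missed. This is only a sketch under stated assumptions: it relies on the standard `iisreset <computername> /restart` form of the command, it must run from a Windows machine with rights on the target servers, and the server names in the usage note are hypothetical. It defaults to a dry run that only prints the commands.

```python
import subprocess


def iisreset_commands(servers):
    """Build one 'iisreset <server> /restart' command per web front end."""
    return [["iisreset", name, "/restart"] for name in servers]


def reset_front_ends(servers, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in iisreset_commands(servers):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if iisreset exits with an error.
            subprocess.run(cmd, check=True)
```

For example, `reset_front_ends(["kf-wfe1", "kf-wfe2"])` prints the two commands without running them; pass `dry_run=False` once the list of servers is correct. An IIS reset drops active sessions, so it is best run in a maintenance window.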