Skip to content

empty path when using CaptureAll results to [""] #1243

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
domenkozar opened this issue Nov 11, 2019 · 18 comments · Fixed by #1516
Closed

empty path when using CaptureAll results to [""] #1243

domenkozar opened this issue Nov 11, 2019 · 18 comments · Fixed by #1516

Comments

@domenkozar
Copy link
Contributor

domenkozar commented Nov 11, 2019

Expected would be [].

@guygastineau
Copy link
Contributor

Hi, I just started using Servant, and I am interested in trying to contribute this fix.

I am pretty comfortable with Haskell. I have found the definition for CaptureAll in the Capture.hs file. I am assuming that this issue comes from something else further up API that is handling it though?

Also, I am familiar with how higher kinded types when dealing with type classes and their instanciation, but I am not familiar with some of the cool type wizardry going on with the DataKinds and PolyKinds language extensions. If anyone knows of good resources to help get me up to speed on those I would appreciate it/ turn it into more contributions 😈

Thanks, and hopefully I can figure out how to fix this ;)

@guygastineau
Copy link
Contributor

Also, @domenkozar if you could give some information about how to reproduce this I would be grateful.

By empty path you mean when a client calls the API with nothing to capture the handler is given [""] instead of [] I assume?

PS. I found where servant-server/src/Servant/Server/Internal.hs deals with CaptureAll, so I think I am closer to the code that needs changing. I will try to find and read the contributing docs too before I ask to many questions that are already answered there.

@guygastineau
Copy link
Contributor

guygastineau commented Jan 2, 2020

Is it related to this part of the test specification? -- taken from line 291 of the file servant-server/test/Servant/ServerSpec.hs:

  it "can capture when there are no elements in 'pathInfo'" $ do
        response <- get "/"
        liftIO $ decode' (simpleBody response) `shouldBe` Just beholder

In the spec tests it is capturing Integer, and the case expression matches 0 to beholder. This suggests that CaptureAll is designed to capture even when the path is empty.

So, it seems like it is defaulting to 0. 0 is mempty for instances of Sum instanciating Monoid I believe. I assume that String and Text have Semigroup and Monoid instances, but that is an assumption. If the behavior of CaptureAll is to produce a singleton list of mempty when no path is given, then [""] is the correct behavior (although it might seem less intuitive than such behavior for Nat or Int).

It seems like I must be on to something here, but I still haven't found the source lines that will confirm my suspicions. Looking forward to solving this, but I will make a concerted effort to wait for a response before continuing to blow up this thread.


EDIT: I believe I have found the relevant source.

At line 497 of file servant/src/Servant/Links.hs I encountered this instance definition of HasLink for CaptureAll (I still need to figure out what the HasLink tpye class does).

instance (ToHttpApiData v, HasLink sub)
    => HasLink (CaptureAll sym v :> sub)
  where
    type MkLink (CaptureAll sym v :> sub) a = [v] -> MkLink sub a
    toLink toA _ l vs = toLink toA (Proxy :: Proxy sub) $
        foldl' (flip $ addSegment . escaped . Text.unpack . toUrlPiece) l vs

the addSegment function is defined earlier in the same file at line 224:

addSegment :: Escaped -> Link -> Link
addSegment seg l = l { _segments = _segments l <> [seg] }

I still need to check out what the Escaped and Link types are, but we are clearly leveraging Semigroup here with <>, so I think it is likely that this code is close if not directly related to why it defaults to [""] when an empty path is supplied to a CaptureAll api that expects text.

I will continue to investigate. It looks more intentional as I continue to dive into the source.

@domenkozar
Copy link
Contributor Author

MkLink is used when generating links, not when handling an HTTP request.

By empty path you mean when a client calls the API with nothing to capture the handler is given [""] instead of [] I assume?

Exactly.

If the behavior of CaptureAll is to produce a singleton list of mempty when no path is given

I'd expect it to be an empty list.

@guygastineau
Copy link
Contributor

Sure, I understand wanting it to be an empty list (another case of mempty), but from reading the test specifications it looks like the project is testing to ensure the opposite of the behavior requested here.

Thank you for feedback on my questions ;)

@domenkozar Do you know where in the source the behavior is programmed? I have been away from home, and without my computers since I last got on here, so no chances for further source diving yet.

@alpmestan
Copy link
Contributor

The v returned here: https://github.com/haskell-servant/servant/blob/master/servant-server/src/Servant/Server/Internal.hs#L221 is what gets passed to the user-supplied handler for the corresponding endpoint. You will quite likely want to play with that path parsing function to see what's going on when you pass it "/".

@guygastineau
Copy link
Contributor

@alpmestan thank you very much. I'll get to it this week before I start work and classes again ;)

@domenkozar I think I can get this fixed pretty soon. Thank you for opening an issue that is friendly to new contributors. I am excited to get involved with this project ;)

@alpmestan
Copy link
Contributor

@guygastineau Thanks for looking into this. :-)

@guygastineau
Copy link
Contributor

guygastineau commented Jan 12, 2020

Just a little update about where I am on working through this:

  1. I need to mess with parseUrlPieces from Web.HttpApiData in ghci separately from Servant. Since that function doesn't come from this project I guess I will need to post process it's output by plugging a polymorphic function in to the end of the route function defined in the link @alpmestan posted.
  2. I am getting additional tests sorted out first. This is my current revision of the CaptureAll specification (I will make a PR once things are further along).
type CaptureAllApi = "legs" :> CaptureAll "legs" Integer :> Get '[JSON] Animal                                        
                :<|> "arms" :> CaptureAll "arms" String  :> Get '[JSON] [Animal]                                      
captureAllApi :: Proxy CaptureAllApi                                                                                  
captureAllApi = Proxy                                                                                                 
captureAllServer :: Server CaptureAllApi                                                                              
captureAllServer = handleLegs :<|> handleArms                                                                         
  where                                                                                                               
    handleLegs :: [Integer] -> Handler Animal                                                                         
    handleLegs legs = case sum legs of                                                                                
      4 -> return jerry                                                                                               
      2 -> return tweety                                                                                              
      0 -> return beholder                                                                                            
      _ -> throwError err404                                                                                          
                                                                                                                      
    handleArms :: [String] -> Handler [Animal]                                                                        
    handleArms = mapM animalFromString                                                                                
      where                                                                                                           
        animalFromString :: String -> Handler Animal                                                                  
        animalFromString "tweety"   = return tweety                                                                   
        animalFromString "jerry"    = return jerry                                                                    
        animalFromString "beholder" = return beholder                                                                 
        animalFromString ""         = return beholder                                                                 
        animalFromString _          = throwError err404                                                               
                                                                                                                      
captureAllSpec :: Spec                                                                                                
captureAllSpec = do                                                                                                   
  describe "Servant.API.CaptureAll" $ do                                                                              
    with (return (serve captureAllApi captureAllServer)) $ do

      it "can capture a single element of the 'pathInfo'" $ do                                                        
        response <- get "/legs/2"                                                                                     
        liftIO $ decode' (simpleBody response) `shouldBe` Just tweety                                                 
                                                                                                                      
      it "can capture multiple elements of the 'pathInfo'" $ do                                                       
        response <- get "/legs/2/2"                                                                                   
        liftIO $ decode' (simpleBody response) `shouldBe` Just jerry                                                  
                                                                                                                      
      it "can capture arbitrarily many elements of the 'pathInfo'" $ do                                               
        response <- get "/legs/1/1/0/1/0/1"                                                                           
        liftIO $ decode' (simpleBody response) `shouldBe` Just jerry                                                  
                                                                                                                      
      it "can capture when there are no elements in 'pathInfo'" $ do                                                  
        response <- get "/legs"                                                                                       
        liftIO $ decode' (simpleBody response) `shouldBe` Just beholder                                               
                                                                                                                      
      it "returns 400 if the decoding fails" $ do                                                                     
        get "/legs/notAnInt" `shouldRespondWith` 400                                                                  
                                                                                                                      
      it "returns 400 if the decoding fails, regardless of which element" $ do                                        
        get "/legs/1/0/0/notAnInt/3/" `shouldRespondWith` 400                                                         
                                                                                                                      
      it "returns 400 if the decoding fails, even when it's multiple elements" $ do                                   
        get "/legs/1/0/0/notAnInt/3/orange/" `shouldRespondWith` 400                                                  
                                                                                                                      
      -- Test the string capturing half of the API                                                                    
      it "can capture a single element of 'pathinfo'" $ do                                                            
        response <- get "/arms/tweety"                                                                                
        liftIO $ decode' (simpleBody response) `shouldBe` Just [tweety]                                               
                                                                                                                      
      it "can capture multiple elements of 'pathinfo'" $ do                                                           
        response <- get "/arms/tweety/jerry"                                                                          
        liftIO $ decode' (simpleBody response) `shouldBe` Just [tweety, jerry]                                        
                                                                                                                      
      it "can capture arbitrarily many elements of 'pathinfo'" $ do                                                   
        response <- get "/arms/tweety/jerry/beholder/tweety/jerry"                                                    
        liftIO $ decode' (simpleBody response) `shouldBe`                                                             
          Just [tweety, jerry, beholder, tweety, jerry]                                                               
                                                                                                                      
      it "can capture when no elements are in 'pathinfo'" $ do                                                        
        response <- get "/arms/"                                                                                      
        liftIO $ decode' (simpleBody response) `shouldBe` Just [beholder]

Currently it is testing if the behavior is the same for Strings and Integers getting captured.

The first thing I have noticed about this:

  • The legs (Integer) branch only passes the no elements test when there is no / at the end of the url path.
  • The arms (String) branch only passes the no elements test when there is a / at the end of the url path.

EDIT: UPDATE - [""] matches for strings in the test when a / is at the end of 'pathinfo, but when it isn't there the expected empty list [] is returned. Somehow I am still having trouble matching on Integer with the trailing /, but I think this is what is happening:

  1. The 0 in the case for handleLegs was passing [] not [0] (it is much clearer when a direct match is used without the indirection).
  2. A trailing / in the 'pathinfo is somehow triggering a singleton list of what appears to be mempty (but I cant match on it for Integer so I don't what is getting returned there).
  3. I should write some more tests that check paths like "/arms////"
  4. Luckily now I clearly see this as a bug, and I realize that the behavior being tested by the originally specifications were expecting the behavior that @domenkozar wants !

If these quirks simply reside in parseUrlPieces, then it shouldn't be too hard to change, but if it is in the depths of Delayed and addCapture then I might need a few pointers.

Thanks for letting me store these progress notes up here ;)

@guygastineau
Copy link
Contributor

guygastineau commented Jan 12, 2020

Okay, I confirmed some of my suspicions. The following passed, but I imagine the specification would be for Just [jerry, tweety] to be the correct match:

      it "wont ignore empty slots in 'pathinfo'" $ do
        response <- get "/arms/jerry///tweety" 
        liftIO $ decode' (simpleBody response) `shouldBe`
          Just [jerry, beholder, beholder, tweety]

So empty path segments are getting turned into empty Strings. I still need to get to the bottom of what is happening with Integers then...

@guygastineau
Copy link
Contributor

After playing with parseUrlPieces I am pretty confident that this problem is happening wherever txts is getting made. I will look into that more tomorrow.

At least I am having fun and learning more about multiple libraries ;)

@guygastineau
Copy link
Contributor

I think I will finally have some time this weekend to look into this again. At least I left all that documentation of the journey and thought process above ;)

@guygastineau
Copy link
Contributor

Okay, I found this again. It has been a LONG time since I last looked into this. At least I have a lot more experience, and haskell build tooling is also more powerful/easier to use. I will take another stab at this. Wish me luck!

@tchoutri
Copy link
Contributor

@guygastineau Godspeed!

@guygastineau
Copy link
Contributor

I am only able to reproduce the buggy behavior when the "empty" path piece is in a sub route. That is, tests for / result in empty lists, but in a route like /legs/ it becomes [""]. Previously I thought it wasn't happening for the Integer based route, but that is not true. The existing tests summed the list, and I think they spuriously accepted [0] as [] since sum [0] = sum [] = 0.

This problem is happening before parsePathPieces. I think it must be happening in the code that processes a Delayed, but I have not found the actual code managing the pathinfo yet. Anyone is welcome to take a look at my new tests in this branch

@guygastineau
Copy link
Contributor

guygastineau commented Jan 25, 2022

It is actually happening in the usage of pathInfo :: Request -> [Text] in runRouter I think. I am almost there!

@guygastineau
Copy link
Contributor

guygastineau commented Jan 25, 2022

Okay, so I fixed it for routes that have a previoous path piece, ie. /legs/ => [] now. However, I think that a rooted capture all like will be broken, because // will yield [] instead of [""]. I will make some tests to confirm this. I will also check out the 1pathinfo function from wai, because really I think this is a bug in that function. This is handled in the CaptureRouter too, and I think it should be not handled in servant-server at all. Anyway, after some more tests to prove what I think is happening I will make a PR, and maintainers can weigh in on what they think we should do :)

@guygastineau
Copy link
Contributor

Actually, I think I arrived at the solution everyone would want :)

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Jan 30, 2025
0.20.2
----

- Fix build of examples/greet.hs. Add "429 Too Many Requests" error. [#1591](haskell-servant/servant#1591)
- Full query string helpers [#1604](haskell-servant/servant#1604)
  This involves a new instance `HasServer (QueryString :> api) context`.
- Add `MkHandler` pattern synonym [#1732](haskell-servant/servant#1732) [#1733](haskell-servant/servant#1733)

  Add a bidirectional pattern synonym to construct `Handler a` values from `IO
  (Either ServerError a)` ones, and match in the other direction.
- Add instance `HasServer (EmptyAPI :> api) context` [#1775](haskell-servant/servant#1775)
- Bugfix - CaptureAll produces [""] for empty paths due to trailing slash. [#1243](haskell-servant/servant#1243) [#1516](haskell-servant/servant#1516)

  CaptureAll resulted in `[""]` for empty paths due to trailing slash.  Similar
  oddities occurred around these edge cases like `"/"` resulted in `[]` correctly,
  but `"//"` resulted in `["", ""]`.  This patch simply eliminates the first `""`
  in the pathinfo list as taken from the wai response.  This might break user
  code that relies on personal shims to solve the problem, however simply removing their
  workarounds should fix their code as the behavior is now sane.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants